Using Truncated Guide RNAs (tru-gRNAs) to Increase Specificity for RNA-Guided Genome Editing

ABSTRACT

Methods for increasing specificity of RNA-guided genome editing, e.g., editing using CRISPR/Cas9 systems, using truncated guide RNAs (tru-gRNAs).

CLAIM OF PRIORITY

This application is a continuation of U.S. patent application Ser. No. 16/572,248, filed Sep. 16, 2019, which is a continuation of U.S. patent application Ser. No. 15/430,218, filed Feb. 10, 2017, now U.S. Pat. No. 10,415,059, which is a continuation of U.S. patent application Ser. No. 14/213,723, filed on Mar. 14, 2014, now U.S. Pat. No. 9,567,604, which claims the benefit of U.S. patent application Ser. Nos. 61/799,647, filed on Mar. 15, 2013; 61/838,178, filed on Jun. 21, 2013; 61/838,148, filed on Jun. 21, 2013, and 61/921,007, filed on Dec. 26, 2013. The entire contents of the foregoing are hereby incorporated by reference.

FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

This invention was made with Government support under Grant No. DP1 GM105378 awarded by the National Institutes of Health. The Government has certain rights in the invention.

SEQUENCE LISTING

This application contains a Sequence Listing that has been submitted electronically as an XML file named “40174-0007004_SL.XML.” The XML file, created on Sep. 6, 2023, is 4,059,242 bytes in size. The material in the XML file is hereby incorporated by reference in its entirety.

TECHNICAL FIELD

Methods for increasing specificity of RNA-guided genome editing, e.g., editing using CRISPR/Cas9 systems, using truncated guide RNAs (tru-gRNAs).

BACKGROUND

Recent work has demonstrated that clustered, regularly interspaced, short palindromic repeats (CRISPR)/CRISPR-associated (Cas) systems (Wiedenheft et al., Nature 482, 331-338 (2012); Horvath et al., Science 327, 167-170 (2010); Terns et al., Curr Opin Microbiol 14, 321-327 (2011)) can serve as the basis for performing genome editing in bacteria, yeast and human cells, as well as in vivo in whole organisms such as fruit flies, zebrafish and mice (Wang et al., Cell 153, 910-918 (2013); Shen et al., Cell Res (2013); Dicarlo et al., Nucleic Acids Res (2013); Jiang et al., Nat Biotechnol 31, 233-239 (2013); Jinek et al., Elife 2, e00471 (2013); Hwang et al., Nat Biotechnol 31, 227-229 (2013); Cong et al., Science 339, 819-823 (2013); Mali et al., Science 339, 823-826 (2013c); Cho et al., Nat Biotechnol 31, 230-232 (2013); Gratz et al., Genetics 194(4):1029-35 (2013)). The Cas9 nuclease from S. pyogenes (hereafter simply Cas9) can be guided via base pair complementarity between the first 20 nucleotides of an engineered guide RNA (gRNA) and the complementary strand of a target genomic DNA sequence of interest that lies next to a protospacer adjacent motif (PAM), e.g., a PAM matching the sequence NGG or NAG (Shen et al., Cell Res (2013); Dicarlo et al., Nucleic Acids Res (2013); Jiang et al., Nat Biotechnol 31, 233-239 (2013); Jinek et al., Elife 2, e00471 (2013); Hwang et al., Nat Biotechnol 31, 227-229 (2013); Cong et al., Science 339, 819-823 (2013); Mali et al., Science 339, 823-826 (2013c); Cho et al., Nat Biotechnol 31, 230-232 (2013); Jinek et al., Science 337, 816-821 (2012)).
Previous studies performed in vitro (Jinek et al., Science 337, 816-821 (2012)), in bacteria (Jiang et al., Nat Biotechnol 31, 233-239 (2013)) and in human cells (Cong et al., Science 339, 819-823 (2013)) have shown that Cas9-mediated cleavage can, in some cases, be abolished by single mismatches at the gRNA/target site interface, particularly in the last 10-12 nucleotides (nts) located in the 3′ end of the 20 nt gRNA complementarity region.

SUMMARY

CRISPR-Cas genome editing uses a guide RNA, which includes both a complementarity region (which binds the target DNA by base-pairing) and a Cas9-binding region, to direct a Cas9 nuclease to a target DNA (see FIG. 1). The nuclease can tolerate a number of mismatches (up to five, as shown herein) in the complementarity region and still cleave; it is hard to predict the effects of any given single or combination of mismatches on activity. Taken together, these nucleases can show significant off-target effects but it can be challenging to predict these sites. Described herein are methods for increasing the specificity of genome editing using the CRISPR/Cas system, e.g., using Cas9 or Cas9-based fusion proteins. In particular, provided are truncated guide RNAs (tru-gRNAs) that include a shortened target complementarity region (i.e., less than 20 nts, e.g., 17-19 or 17-18 nts of target complementarity, e.g., 17, 18 or 19 nts of target complementarity), and methods of using the same. As used herein, “17-18 or 17-19” includes 17, 18, or 19 nucleotides.

In one aspect, the invention provides a guide RNA molecule (e.g., a single guide RNA or a crRNA) having a target complementarity region of 17-18 or 17-19 nucleotides, e.g., the target complementarity region consists of 17-18 or 17-19 nucleotides, e.g., the target complementarity region consists of 17-18 or 17-19 nucleotides of consecutive target complementarity. In some embodiments, the guide RNA includes a complementarity region consisting of 17-18 or 17-19 nucleotides that are complementary to 17-18 or 17-19 consecutive nucleotides of the complementary strand of a selected target genomic sequence. In some embodiments, the target complementarity region consists of 17-18 nucleotides (of target complementarity). In some embodiments, the complementarity region is complementary to 17 consecutive nucleotides of the complementary strand of a selected target sequence. In some embodiments, the complementarity region is complementary to 18 consecutive nucleotides of the complementary strand of a selected target sequence.
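By way of illustration only, the shortening of a complementarity region can be sketched in Python as follows. The sketch assumes that truncation removes nucleotides from the 5′ end of the complementarity region (as in FIG. 2A), so that the 3′-most nucleotides are retained. The function name `truncate_complementarity_region` is hypothetical, and the example input is the first 20 nts of EGFP Site 1 (SEQ ID NO: 9) written as RNA.

```python
def truncate_complementarity_region(full_length: str, length: int) -> str:
    """Keep the 3'-most `length` nts of a complementarity region (5' truncation)."""
    if length not in (17, 18, 19):
        raise ValueError("tru-gRNA complementarity regions are 17, 18 or 19 nts long")
    if len(full_length) < length:
        raise ValueError("input is shorter than the requested length")
    return full_length[-length:]


# First 20 nts of EGFP Site 1 (SEQ ID NO: 9), written as RNA for illustration.
full = "GGGCACGGGCAGCUUGCCGG"
for n in (19, 18, 17):
    print(n, truncate_complementarity_region(full, n))
```

The sketch only shortens the sequence; it does not check whether the resulting 5′ nucleotide is compatible with a given promoter.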

In another aspect, the invention provides a ribonucleic acid consisting of the sequence:

(SEQ ID NO: 2404) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUA;
(SEQ ID NO: 2407) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUAUGCUGUUUUG;
(SEQ ID NO: 2408) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUAUGCU;
(SEQ ID NO: 1) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCG(X_(N));
(SEQ ID NO: 2) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUAUGCUGAAAAGCAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUC(X_(N));
(SEQ ID NO: 3) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUAUGCUGUUUUGGAAACAAAACAGCAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUC(X_(N));
(SEQ ID NO: 4) (X₁₇₋₁₈)GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC(X_(N));
(SEQ ID NO: 5) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUAAGAGCUAGAAAUAGCAAGUUUAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC;
(SEQ ID NO: 6) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUAUGCUGGAAACAGCAUAGCAAGUUUAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC; or
(SEQ ID NO: 7) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUAAGAGCUAUGCUGGAAACAGCAUAGCAAGUUUAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC;

wherein X₁₇₋₁₈ or X₁₇₋₁₉ is a sequence (of 17-18 or 17-19 nucleotides) complementary to the complementary strand of a selected target sequence, preferably a target sequence immediately 5′ of a protospacer adjacent motif (PAM), e.g., NGG, NAG, or NNGG (see, for example, the configuration in FIG. 1), and X_(N) is any sequence, wherein N (in the RNA) can be 0-200, e.g., 0-100, 0-50, or 0-20, that does not interfere with the binding of the ribonucleic acid to Cas9. In no case is the X₁₇₋₁₈ or X₁₇₋₁₉ identical to a sequence that naturally occurs adjacent to the rest of the RNA. In some embodiments the RNA includes one or more U, e.g., 1 to 8 or more Us (e.g., U, UU, UUU, UUUU, UUUUU, UUUUUU, UUUUUUU, UUUUUUUU) at the 3′ end of the molecule, as a result of the optional presence of one or more Ts used as a termination signal to terminate RNA PolIII transcription. In some embodiments the RNA includes one or more, e.g., up to 3, e.g., one, two, or three, additional nucleotides at the 5′ end of the RNA molecule that are not complementary to the target sequence.
In some embodiments, the target complementarity region consists of 17-18 nucleotides (of target complementarity). In some embodiments, the complementarity region is complementary to 17 consecutive nucleotides of the complementary strand of a selected target sequence. In some embodiments, the complementarity region is complementary to 18 consecutive nucleotides of the complementary strand of a selected target sequence.
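The general form "(X₁₇₋₁₈ or X₁₇₋₁₉) followed by a fixed sequence" can be illustrated with a short Python sketch that assembles a crRNA of the form of SEQ ID NO: 2404, optionally with 3′ Us. This is a minimal sketch under stated assumptions: the function name `build_crrna` and its parameters are hypothetical, and the input is taken to be the PAM-adjacent target strand written as DNA, so that X has the same sequence with U in place of T (i.e., X is complementary to the complementary strand).

```python
CRRNA_REPEAT = "GUUUUAGAGCUA"  # portion following X in SEQ ID NO: 2404


def build_crrna(target_dna: str, n_terminal_u: int = 0) -> str:
    """Assemble X (17-19 nts) + repeat, optionally followed by 3' Us."""
    x = target_dna.upper().replace("T", "U")  # write the target strand as RNA
    if not 17 <= len(x) <= 19:
        raise ValueError("X must be 17, 18 or 19 nucleotides long")
    if set(x) - set("ACGU"):
        raise ValueError("X must contain only A, C, G and U/T")
    if n_terminal_u < 0:
        raise ValueError("number of 3' Us cannot be negative")
    return x + CRRNA_REPEAT + "U" * n_terminal_u


# 18-nt example derived from EGFP Site 1 (SEQ ID NO: 9); illustrative only.
print(build_crrna("GCACGGGCAGCTTGCCGG"))
print(build_crrna("GCACGGGCAGCTTGCCGG", n_terminal_u=4))
```

The sketch does not check the proviso that X is not identical to a sequence that naturally occurs adjacent to the rest of the RNA.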

In another aspect, the invention provides DNA molecules encoding the ribonucleic acids described herein, and host cells harboring or expressing the ribonucleic acids or vectors.

In a further aspect, the invention provides methods for increasing specificity of RNA-guided genome editing in a cell, the method comprising contacting the cell with a guide RNA that includes a complementarity region consisting of 17-18 or 17-19 nucleotides that are complementary to 17-18 or 17-19 consecutive nucleotides of the complementary strand of a selected target genomic sequence, as described herein.

In yet another aspect, the invention provides methods for inducing a single or double-stranded break in a target region of a double-stranded DNA molecule, e.g., in a genomic sequence in a cell. The methods include expressing in or introducing into the cell: a Cas9 nuclease or nickase; and a guide RNA that includes a sequence consisting of 17 or 18 or 19 nucleotides that are complementary to the complementary strand of a selected target sequence, preferably a target sequence immediately 5′ of a protospacer adjacent motif (PAM), e.g., NGG, NAG, or NNGG, e.g., a ribonucleic acid as described herein.
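The phrase "a target sequence immediately 5′ of a protospacer adjacent motif" can be illustrated by a simple scan of a DNA sequence. The following Python sketch is an assumption-laden illustration rather than a described procedure: the function name `find_target_sites` is hypothetical, only the strand as written is scanned, and only the NGG form of the PAM is considered.

```python
def find_target_sites(dna: str, length: int = 17):
    """Return (start, site, pam) for each `length`-nt site immediately 5' of an NGG PAM."""
    dna = dna.upper()
    hits = []
    # The last possible site must leave room for a 3-nt PAM after it.
    for start in range(len(dna) - length - 2):
        pam = dna[start + length:start + length + 3]
        if pam[1:] == "GG":  # N can be any nucleotide
            hits.append((start, dna[start:start + length], pam))
    return hits


# EGFP Site 1 (SEQ ID NO: 9): 20-nt target site followed by its PAM.
egfp_site_1 = "GGGCACGGGCAGCTTGCCGGTGG"
print(find_target_sites(egfp_site_1, length=20))
print(find_target_sites(egfp_site_1, length=17))
```

A fuller search would also scan the reverse complement and could accept NAG or NNGG motifs.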

Also provided herein are methods for modifying a target region of a double-stranded DNA molecule in a cell. The methods include expressing in or introducing into the cell: a dCas9-heterologous functional domain fusion protein (dCas9-HFD); and a guide RNA that includes a complementarity region consisting of 17-18 or 17-19 nucleotides that are complementary to 17-18 or 17-19 consecutive nucleotides of the complementary strand of a selected target genomic sequence, as described herein.

In some embodiments, the guide RNA is (i) a single guide RNA that includes a complementarity region consisting of 17-18 or 17-19 nucleotides that are complementary to 17-18 or 17-19 consecutive nucleotides of the complementary strand of a selected target genomic sequence, or (ii) a crRNA that includes a complementarity region consisting of 17-18 or 17-19 nucleotides that are complementary to 17-18 or 17-19 consecutive nucleotides of the complementary strand of a selected target genomic sequence, and a tracrRNA.

In some embodiments, the target complementarity region consists of 17-18 nucleotides (of target complementarity). In some embodiments, the complementarity region is complementary to 17 consecutive nucleotides of the complementary strand of a selected target sequence. In some embodiments, the complementarity region is complementary to 18 consecutive nucleotides of the complementary strand of a selected target sequence.

In no case is the X₁₇₋₁₈ or X₁₇₋₁₉ of any of the molecules described herein identical to a sequence that naturally occurs adjacent to the rest of the RNA. In some embodiments the RNA includes one or more U, e.g., 1 to 8 or more Us (e.g., U, UU, UUU, UUUU, UUUUU, UUUUUU, UUUUUUU, UUUUUUUU) at the 3′ end of the molecule, as a result of the optional presence of one or more Ts used as a termination signal to terminate RNA PolIII transcription. In some embodiments the RNA includes one or more, e.g., up to 3, e.g., one, two, or three, additional nucleotides at the 5′ end of the RNA molecule that are not complementary to the target sequence.

In some embodiments, one or more of the nucleotides of the RNA is modified, e.g., locked (2′-O-4′-C methylene bridge), is 5′-methylcytidine, is 2′-O-methyl-pseudouridine, or in which the ribose phosphate backbone has been replaced by a polyamide chain, e.g., one or more of the nucleotides within or outside the target complementarity region X₁₇₋₁₈ or X₁₇₋₁₉. In some embodiments, some or all of the tracrRNA or crRNA, e.g., within or outside the X₁₇₋₁₈ or X₁₇₋₁₉ target complementarity region, comprises deoxyribonucleotides (e.g., is all or partially DNA, e.g. DNA/RNA hybrids).

In an additional aspect, the invention provides methods for modifying a target region of a double-stranded DNA molecule, e.g., in a genomic sequence in a cell. The methods include expressing in or introducing into the cell:

-   a dCas9-heterologous functional domain fusion protein (dCas9-HFD); and
-   a guide RNA that includes a sequence consisting of 17-18 or 17-19 nucleotides that are complementary to the complementary strand of a selected target sequence, preferably a target sequence immediately 5′ of a protospacer adjacent motif (PAM), e.g., NGG, NAG, or NNGG, e.g., a ribonucleic acid as described herein. In no case is the X₁₇₋₁₈ or X₁₇₋₁₉ identical to a sequence that naturally occurs adjacent to the rest of the RNA. In some embodiments the RNA includes one or more, e.g., up to 3, e.g., one, two, or three, additional nucleotides at the 5′ end of the RNA molecule that are not complementary to the target sequence.

In another aspect, the invention provides methods for modifying, e.g., introducing a sequence specific break into, a target region of a double-stranded DNA molecule, e.g., in a genomic sequence in a cell. The methods include expressing in or introducing into the cell: a Cas9 nuclease or nickase, or a dCas9-heterologous functional domain fusion protein (dCas9-HFD);

-   a tracrRNA, e.g., comprising or consisting of the sequence
    -   GGAACCAUUCAAAACAGCAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC (SEQ ID NO:8) or an active portion thereof;
    -   UAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC (SEQ ID NO:2405) or an active portion thereof;
    -   AGCAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC (SEQ ID NO:2406) or an active portion thereof;
    -   CAAAACAGCAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC (SEQ ID NO:2409) or an active portion thereof;
    -   UAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUG (SEQ ID NO:2410) or an active portion thereof;
    -   UAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCA (SEQ ID NO:2411) or an active portion thereof; or
    -   UAGCAAGUUAAAAUAAGGCUAGUCCG (SEQ ID NO:2412) or an active portion thereof; and
-   a crRNA that includes a sequence consisting of 17-18 or 17-19 nucleotides that are complementary to the complementary strand of a selected target sequence, preferably a target sequence immediately 5′ of a protospacer adjacent motif (PAM), e.g., NGG, NAG, or NNGG; in some embodiments the crRNA has the sequence:

(SEQ ID NO: 2404) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUA;
(SEQ ID NO: 2407) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUAUGCUGUUUUG; or
(SEQ ID NO: 2408) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUAUGCU.

In some embodiments the crRNA is (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUAUGCUGUUUUG (SEQ ID NO:2407) and the tracrRNA is GGAACCAUUCAAAACAGCAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC (SEQ ID NO:8); the crRNA is (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUA (SEQ ID NO:2404) and the tracrRNA is UAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC (SEQ ID NO:2405); or the crRNA is (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUAUGCU (SEQ ID NO:2408) and the tracrRNA is AGCAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC (SEQ ID NO:2406).

In no case is the X₁₇₋₁₈ or X₁₇₋₁₉ identical to a sequence that naturally occurs adjacent to the rest of the RNA. In some embodiments the RNA (e.g., tracrRNA or crRNA) includes one or more U, e.g., 2 to 8 or more Us (e.g., U, UU, UUU, UUUU, UUUUU, UUUUUU, UUUUUUU, UUUUUUUU) at the 3′ end of the molecule, as a result of the optional presence of one or more Ts used as a termination signal to terminate RNA PolIII transcription. In some embodiments the RNA (e.g., tracrRNA or crRNA) includes one or more, e.g., up to 3, e.g., one, two, or three, additional nucleotides at the 5′ end of the RNA molecule that are not complementary to the target sequence. In some embodiments, one or more of the nucleotides of the crRNA or tracrRNA is modified, e.g., locked (2′-O-4′-C methylene bridge), is 5′-methylcytidine, is 2′-O-methyl-pseudouridine, or in which the ribose phosphate backbone has been replaced by a polyamide chain, e.g., one or more of the nucleotides within or outside the sequence X₁₇₋₁₈ or X₁₇₋₁₉. In some embodiments, some or all of the tracrRNA or crRNA, e.g., within or outside the X₁₇₋₁₈ or X₁₇₋₁₉ target complementarity region, comprises deoxyribonucleotides (e.g., is all or partially DNA, e.g. DNA/RNA hybrids).

In some embodiments, the dCas9-heterologous functional domain fusion protein (dCas9-HFD) comprises a HFD that modifies gene expression, histones, or DNA, e.g., transcriptional activation domains, transcriptional repressors (e.g., silencers such as Heterochromatin Protein 1 (HP1), e.g., HP1α or HP1β), enzymes that modify the methylation state of DNA (e.g., DNA methyltransferase (DNMT) or TET proteins, e.g., TET1), or enzymes that modify histone subunits (e.g., histone acetyltransferases (HAT), histone deacetylases (HDAC), or histone demethylases). In preferred embodiments, the heterologous functional domain is a transcriptional activation domain, e.g., a VP64 or NF-κB p65 transcriptional activation domain; an enzyme that catalyzes DNA demethylation, e.g., a TET protein family member or the catalytic domain from one of these family members; an enzyme that catalyzes histone modification (e.g., LSD1, histone methyltransferase, HDACs, or HATs); a transcription silencing domain, e.g., from Heterochromatin Protein 1 (HP1), e.g., HP1α or HP1β; or a biological tether, e.g., MS2, CRISPR/Cas Subtype Ypest protein 4 (Csy4) or lambda N protein. dCas9-HFD are described in U.S. Provisional patent application U.S. Ser. No. 61/799,647, filed on Mar. 15, 2013, Attorney docket no. 00786-0882P02, U.S. Ser. No. 61/838,148, filed on Jun. 21, 2013, and PCT International Application No. PCT/US14/27335, all of which are incorporated herein by reference in their entirety.

In some embodiments, the methods described herein result in an indel mutation or sequence alteration in the selected target genomic sequence.

In some embodiments, the cell is a eukaryotic cell, e.g., a mammalian cell, e.g., a human cell.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Methods and materials are described herein for use in the present invention; other, suitable methods and materials known in the art can also be used. The materials, methods, and examples are illustrative only and not intended to be limiting. All publications, patent applications, patents, sequences, database entries, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control.

Other features and advantages of the invention will be apparent from the following detailed description and figures, and from the claims.

DESCRIPTION OF DRAWINGS

FIG. 1: Schematic illustrating a gRNA/Cas9 nuclease complex bound to its target DNA site. Scissors indicate approximate cleavage points of the Cas9 nuclease on the genomic DNA target site. Note the numbering of nucleotides on the guide RNA proceeds in an inverse fashion from 5′ to 3′. Figure discloses SEQ ID NOS 2691-2693, respectively, in order of appearance.

FIG. 2A: Schematic illustrating a rationale for truncating the 5′ complementarity region of a gRNA. Thick grey lines=target DNA site, thin dark grey line structure=gRNA, black lines show base pairing (or lack thereof) between gRNA and target DNA site.

FIG. 2B: Schematic overview of the EGFP disruption assay. Repair of targeted Cas9-mediated double-stranded breaks in a single integrated EGFP-PEST reporter gene by error-prone NHEJ-mediated repair leads to frame-shift mutations that disrupt the coding sequence and associated loss of fluorescence in cells.

FIGS. 2C-F: Activities of RNA-guided nucleases (RGNs) harboring single guide RNAs (gRNAs) bearing (C) single mismatches, (D) adjacent double mismatches, (E) variably spaced double mismatches, and (F) increasing numbers of adjacent mismatches assayed on three different target sites in the EGFP reporter gene sequence. Mean activities of replicates are shown, normalized to the activity of a perfectly matched single gRNA. Error bars indicate standard errors of the mean. Positions mismatched in each single gRNA are highlighted in grey in the grid below. Sequences of the three EGFP target sites were as follows:

(SEQ ID NO: 9) EGFP Site 1 GGGCACGGGCAGCTTGCCGGTGG
(SEQ ID NO: 10) EGFP Site 2 GATGCCGTTCTTCTGCTTGTCGG
(SEQ ID NO: 11) EGFP Site 3 GGTGGTGCAGATGAACTTCAGGG

FIG. 2G: Mismatches at the 5′ end of the gRNA make CRISPR/Cas more sensitive to more 3′ mismatches. The gRNAs Watson-Crick base pair between the RNA and DNA with the exception of positions indicated with an “m”, which are mismatched using the Watson-Crick transversion (i.e., EGFP Site #2 M18-19 is mismatched by changing the gRNA to its Watson-Crick partner at positions 18 and 19). Although mismatches at positions near the 5′ end of the gRNA are generally very well tolerated, matches in these positions are important for nuclease activity when other residues are mismatched. When all four positions are mismatched, nuclease activity is no longer detectable. This further demonstrates that matches at these 5′ positions can help compensate for mismatches at other more 3′ positions. Note these experiments were performed with a non-codon optimized version of Cas9, which can show lower absolute levels of nuclease activity as compared to the codon optimized version.

FIG. 2H: Efficiency of Cas9 nuclease activities directed by gRNAs bearing variable length complementarity regions ranging from 15 to 25 nts in a human cell-based U2OS EGFP disruption assay. Expression of a gRNA from the U6 promoter requires the presence of a 5′ G and therefore it was only possible to evaluate gRNAs harboring certain lengths of complementarity to the target DNA site (15, 17, 19, 20, 21, 23, and 25 nts). Figure discloses SEQ ID NOS 2694-2700, respectively, in order of appearance.
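The 5′ G constraint noted for FIG. 2H can be illustrated with a short Python sketch that lists which complementarity lengths begin with a G when a target site is shortened from its 5′ end. The target site used for FIG. 2H is not reproduced in this section, so the sketch uses the first 20 nts of EGFP Site 1 (SEQ ID NO: 9) purely as an example; the function name `expressible_lengths` is hypothetical.

```python
def expressible_lengths(protospacer: str, min_len: int, max_len: int) -> list[int]:
    """Lengths n for which the 3'-most n nts of `protospacer` begin with a G."""
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    protospacer = protospacer.upper()
    max_len = min(max_len, len(protospacer))
    # protospacer[-n] is the 5'-most base of the 3'-most n nucleotides.
    return [n for n in range(min_len, max_len + 1) if protospacer[-n] == "G"]


# First 20 nts of EGFP Site 1 (SEQ ID NO: 9); illustrative only.
print(expressible_lengths("GGGCACGGGCAGCTTGCCGG", 15, 20))
```

This treats a matched 5′ G as a hard requirement, which is a simplification made for the sketch.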

FIG. 3A: Efficiencies of EGFP disruption in human cells mediated by Cas9 and full-length or shortened gRNAs for four target sites in the EGFP reporter gene. Lengths of complementarity regions and corresponding target DNA sites are shown. Ctrl=control gRNA lacking a complementarity region. Figure discloses SEQ ID NOS 2701, 9, 2702, 10, 2703-2704, 11 and 2705-2707, respectively, in order of appearance.

FIG. 3B: Efficiencies of targeted indel mutations introduced at seven different human endogenous gene targets by matched standard RGNs (Cas9 and standard full-length gRNAs) and tru-RGNs (Cas9 and gRNAs bearing truncations in their 5′ complementarity regions). Lengths of gRNA complementarity regions and corresponding target DNA sites are shown. Indel frequencies were measured by T7EI assay. Ctrl=control gRNA lacking a complementarity region. Figure discloses SEQ ID NOS 2708-2721, respectively, in order of appearance.

FIG. 3C: DNA sequences of indel mutations induced by RGNs using a tru-gRNA or a matched full-length gRNA targeted to the EMX1 site. The portion of the target DNA site that interacts with the gRNA complementarity region is highlighted in grey with the first base of the PAM sequence shown in lowercase. Deletions are indicated by dashes highlighted in grey and insertions by italicized letters highlighted in grey. The net number of bases deleted or inserted and the number of times each sequence was isolated are shown to the right. Figure discloses SEQ ID NOS 2722-2754, respectively, in order of appearance.

FIG. 3D: Efficiencies of precise HDR/ssODN-mediated alterations introduced at two endogenous human genes by matched standard and tru-RGNs. % HDR was measured using a BamHI restriction digest assay (see the Experimental Procedures for Example 2). Control gRNA=empty U6 promoter vector.

FIG. 3E: U2OS.EGFP cells were transfected with variable amounts of full-length gRNA expression plasmids (top) or tru-gRNA expression plasmids (bottom) together with a fixed amount of Cas9 expression plasmid and then assayed for percentage of cells with decreased EGFP expression. Mean values from duplicate experiments are shown with standard errors of the mean. Note that the data obtained with tru-gRNA matches closely with data from experiments performed with full-length gRNA expression plasmids instead of tru-gRNA plasmids for these three EGFP target sites.

FIG. 3F: U2OS.EGFP cells were transfected with variable amounts of Cas9 expression plasmid together with fixed amounts of full-length gRNA expression plasmids (top) or tru-gRNA expression plasmids (bottom) for each target (amounts determined for each tru-gRNA from the experiments of FIG. 3E). Mean values from duplicate experiments are shown with standard errors of the mean. Note that the data obtained with tru-gRNA matches closely with data from experiments performed with full-length gRNA expression plasmids instead of tru-gRNA plasmids for these three EGFP target sites. The results of these titrations determined the concentrations of plasmids used in the EGFP disruption assays performed in Examples 1 and 2.

FIG. 4A: Schematic illustrating locations of VEGFA sites 1 and 4 targeted by gRNAs for paired double nicks. Target sites for the full-length gRNAs are underlined with the first base in the PAM sequence shown in lowercase. Location of the BamHI restriction site inserted by HDR with a ssODN donor is shown. Figure discloses SEQ ID NOS 2755-2756, respectively, in order of appearance.

FIG. 4B: A tru-gRNA can be used with a paired nickase strategy to efficiently induce indel mutations. Substitution of a full-length gRNA for VEGFA site 1 with a tru-gRNA does not reduce the efficiency of indel mutations observed with a paired full-length gRNA for VEGFA site 4 and Cas9-D10A nickases. Control gRNA used is one lacking a complementarity region.

FIG. 4C: A tru-gRNA can be used with a paired nickase strategy to efficiently induce precise HDR/ssODN-mediated sequence alterations. Substitution of a full-length gRNA for VEGFA site 1 with a tru-gRNA does not reduce the efficiency of indel mutations observed with a paired full-length gRNA for VEGFA site 4 and Cas9-D10A nickases with an ssODN donor template. Control gRNA used is one lacking a complementarity region.

FIG. 5A: Activities of RGNs targeted to three sites in EGFP using full-length (top) or tru-gRNAs (bottom) with single mismatches at each position (except at the 5′-most base which must remain a G for efficient expression from the U6 promoter). Grey boxes in the grid below represent positions of the Watson-Crick transversion mismatches. Empty gRNA control used is a gRNA lacking a complementarity region. RGN activities were measured using the EGFP disruption assay and values shown represent the percentage of EGFP-negative cells observed relative to an RGN using a perfectly matched gRNA. Experiments were performed in duplicate and means with error bars representing standard errors of the mean are shown.

FIG. 5B: Activities of RGNs targeted to three sites in EGFP using full-length (top) or tru-gRNAs (bottom) with adjacent double mismatches at each position (except at the 5′-most base which must remain a G for efficient expression from the U6 promoter). Data presented as in FIG. 5A.

FIG. 6A: Absolute frequencies of on- and off-target indel mutations induced by RGNs targeted to three different endogenous human gene sites as measured by deep sequencing. Indel frequencies are shown for the three target sites from cells in which targeted RGNs with a full-length gRNA, a tru-gRNA, or a control gRNA lacking a complementarity region were expressed. Absolute counts of indel mutations used to make these graphs can be found in Table 3B.

FIG. 6B: Fold-improvements in off-target site specificities of three tru-RGNs. Values shown represent the ratio of on/off-target activities of tru-RGNs to on/off-target activities of standard RGNs for the off-target sites shown, calculated using the data from (A) and Table 3B. For the sites marked with an asterisk (*), no indels were observed with the tru-RGN and therefore the values shown represent conservative statistical estimates for the fold-improvements in specificities for these off-target sites (see Results and Experimental Procedures).
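The ratio described for FIG. 6B can be written out as a small Python sketch. The function name `fold_improvement` and the numerical inputs are hypothetical and are not taken from Table 3B. The conservative statistical estimate used when no indels are observed with the tru-RGN is not specified in this section, so the sketch simply returns `None` in that case.

```python
def fold_improvement(on_tru, off_tru, on_std, off_std):
    """Ratio of on/off-target activity of a tru-RGN to that of a standard RGN."""
    if min(on_tru, off_tru, on_std, off_std) < 0:
        raise ValueError("frequencies cannot be negative")
    if on_std == 0 or off_std == 0:
        raise ValueError("standard RGN frequencies must be non-zero")
    if off_tru == 0:
        # No off-target indels observed: the ratio is undefined here; the source
        # reports a conservative statistical estimate that is not modeled.
        return None
    return (on_tru / off_tru) / (on_std / off_std)


# Hypothetical indel frequencies (percent), for illustration only.
print(fold_improvement(on_tru=20.0, off_tru=0.5, on_std=25.0, off_std=5.0))
```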

FIG. 6C, top: Comparison of the on-target and an off-target site identified by T7EI assay for the tru-RGN targeted to VEGFA site 1 (more were identified by deep sequencing). Note that the full-length gRNA is mismatched to the two nucleotides at the 5′ end of the target site and that these are the two nucleotides not present in the tru-gRNA target site. Mismatches in the off-target site relative to the on-target are highlighted in bold underlined text. Mismatches between the gRNAs and the off-target site are shown with X's. Figure discloses SEQ ID NOS 2757 and 2758, respectively, in order of appearance.

FIG. 6C, bottom: Indel mutation frequencies induced in the off-target site by RGNs bearing full-length or truncated gRNAs. Indel mutation frequencies were determined by T7EI assay. Note that the off-target site in this figure is one that we had examined previously for indel mutations induced by the standard RGN targeted to VEGFA site 1 and designated as site OT1-30 in that earlier study (Example 1 and Fu et al., Nat Biotechnol. 31(9):822-6 (2013)). It is likely that we did not identify off-target mutations at this site in our previous experiments because the frequency of indel mutations appears to be at the reliable detection limit of the T7EI assay (2-5%). Figure discloses SEQ ID NOS 2759 and 2760, respectively, in order of appearance.

FIGS. 7A-D: DNA sequences of indel mutations induced by RGNs using tru-gRNAs or matched full-length gRNAs targeted to VEGFA sites 1 and 3. Sequences depicted as in FIG. 3C. FIGS. 7A-D disclose SEQ ID NOS 2761-2888, respectively, in order of appearance.

FIG. 7E: Indel mutation frequencies induced by tru-gRNAs bearing a mismatched 5′ G nucleotide. Indel mutation frequencies in human U2OS.EGFP cells induced by Cas9 directed by tru-gRNAs bearing 17, 18 or 20 nt complementarity regions for VEGFA sites 1 and 3 and EMX1 site 1 are shown. Three of these gRNAs contain a mismatched 5′ G (indicated by positions marked in bold text). Bars indicate results from experiments using full-length gRNA (20 nt), tru-gRNA (17 or 18 nt), and tru-gRNA with a mismatched 5′ G nucleotide (17 or 18 nt with boldface T at 5′ end). (Note that no activity was detectable for the mismatched tru-gRNA to EMX1 site 1.) Figure discloses SEQ ID NOS 2890-2898, respectively, in order of appearance.

FIGS. 8A-C: Sequences of off-target indel mutations induced by RGNs in human U2OS.EGFP cells. Wild-type genomic off-target sites recognized by RGNs (including the PAM sequence) are highlighted in grey and numbered as in Table 1 and Table B. Note that the complementary strand is shown for some sites. Deleted bases are shown as dashes on a grey background. Inserted bases are italicized and highlighted in grey. FIGS. 8A-C disclose SEQ ID NOS 2899-2974, respectively, in order of appearance.

FIGS. 9A-C: Sequences of off-target indel mutations induced by RGNs in human HEK293 cells. Wild-type genomic off-target sites recognized by RGNs (including the PAM sequence) are highlighted in grey and numbered as in Table 1 and Table B. Note that the complementary strand is shown for some sites. Deleted bases are shown as dashes on a grey background. Inserted bases are italicized and highlighted in grey. *Yielded a large number of single bp indels. FIGS. 9A-C disclose SEQ ID NOS 2975-3037 and 2889, respectively, in order of appearance.

DETAILED DESCRIPTION

CRISPR RNA-guided nucleases (RGNs) have rapidly emerged as a facile and efficient platform for genome editing. Although Marraffini and colleagues (Jiang et al., Nat Biotechnol 31, 233-239 (2013)) recently performed a systematic investigation of Cas9 RGN specificity in bacteria, the specificities of RGNs in human cells have not been extensively defined. Understanding the scope of RGN-mediated off-target effects in human and other eukaryotic cells will be essential if these nucleases are to be used widely for research and therapeutic applications. The present inventors have used a human cell-based reporter assay to characterize off-target cleavage of Cas9-based RGNs. Single and double mismatches were tolerated to varying degrees depending on their position along the guide RNA (gRNA)-DNA interface. Off-target alterations induced by four out of six RGNs targeted to endogenous loci in human cells were readily detected by examination of partially mismatched sites. The off-target sites identified harbor up to five mismatches and many are mutagenized with frequencies comparable to (or higher than) those observed at the intended on-target site. Thus RGNs are highly active even with imperfectly matched RNA-DNA interfaces in human cells, a finding that might confound their use in research and therapeutic applications.

The results described herein reveal that predicting the specificity profile of any given RGN is neither simple nor straightforward. The EGFP reporter assay experiments show that single and double mismatches can have variable effects on RGN activity in human cells that do not strictly depend upon their position(s) within the target site. For example, consistent with previously published reports, alterations in the 3′ half of the sgRNA/DNA interface generally have greater effects than those in the 5′ half (Jiang et al., Nat Biotechnol 31, 233-239 (2013); Cong et al., Science 339, 819-823 (2013); Jinek et al., Science 337, 816-821 (2012)); however, single and double mutations in the 3′ end sometimes also appear to be well tolerated whereas double mutations in the 5′ end can greatly diminish activities. In addition, the magnitude of these effects for mismatches at any given position(s) appears to be site-dependent. Comprehensive profiling of a large series of RGNs with testing of all possible nucleotide substitutions (beyond the Watson-Crick transversions used in our EGFP reporter experiments) may help provide additional insights into the range of potential off-targets. In this regard, the recently described bacterial cell-based method of Marraffini and colleagues (Jiang et al., Nat Biotechnol 31, 233-239 (2013)) or the in vitro, combinatorial library-based cleavage site-selection methodologies previously applied to ZFNs by Liu and colleagues (Pattanayak et al., Nat Methods 8, 765-770 (2011)) might be useful for generating larger sets of RGN specificity profiles.

Despite these challenges in comprehensively predicting RGN specificities, it was possible to identify bona fide off-targets of RGNs by examining a subset of genomic sites that differed from the on-target site by one to five mismatches. Notably, under conditions of these experiments, the frequencies of RGN-induced mutations at many of these off-target sites were similar to (or higher than) those observed at the intended on-target site, enabling the detection of mutations at these sites using the T7EI assay (which, as performed in our laboratory, has a reliable detection limit of ~2 to 5% mutation frequency). Because these mutation rates were very high, it was possible to avoid using deep sequencing methods previously required to detect much lower frequency ZFN- and TALEN-induced off-target alterations (Pattanayak et al., Nat Methods 8, 765-770 (2011); Perez et al., Nat Biotechnol 26, 808-816 (2008); Gabriel et al., Nat Biotechnol 29, 816-823 (2011); Hockemeyer et al., Nat Biotechnol 29, 731-734 (2011)). Analysis of RGN off-target mutagenesis in human cells also confirmed the difficulties of predicting RGN specificities: not all single and double mismatched off-target sites show evidence of mutation whereas some sites with as many as five mismatches can also show alterations. Furthermore, the bona fide off-target sites identified do not exhibit any obvious bias toward transition or transversion differences relative to the intended target sequence (Table E; grey highlighted rows).

Although off-target sites were seen for a number of RGNs, identification of these sites was neither comprehensive nor genome-wide in scale. For the six RGNs studied, only a very small subset of the much larger total number of potential off-target sequences in the human genome (sites that differ by three to six nucleotides from the intended target site; compare Tables E and C) was examined. Although examining such large numbers of loci for off-target mutations by T7EI assay is neither a practical nor a cost-effective strategy, the use of high-throughput sequencing in future studies might enable the interrogation of larger numbers of candidate off-target sites and provide a more sensitive method for detecting bona fide off-target mutations. For example, such an approach might enable the unveiling of additional off-target sites for the two RGNs for which we failed to uncover any off-target mutations. In addition, an improved understanding both of RGN specificities and of any epigenomic factors (e.g., DNA methylation and chromatin status) that may influence RGN activities in cells might also reduce the number of potential sites that need to be examined and thereby make genome-wide assessments of RGN off-targets more practical and affordable.

As described herein, a number of strategies can be used to minimize the frequencies of genomic off-target mutations. For example, the specific choice of RGN target site can be optimized. However, given that off-target sites that differ at up to five positions from the intended target site can be efficiently mutated by RGNs, choosing target sites with minimal numbers of off-target sites as judged by mismatch counting seems unlikely to be effective: thousands of potential off-target sites that differ by four or five positions within the 20 bp RNA:DNA complementarity region will typically exist for any given RGN targeted to a sequence in the human genome (see, for example, Table C). It is also possible that the nucleotide content of the gRNA complementarity region might influence the range of potential off-target effects. For example, high GC-content has been shown to stabilize RNA:DNA hybrids (Sugimoto et al., Biochemistry 34, 11211-11216 (1995)) and therefore might also be expected to make gRNA/genomic DNA hybridization more stable and more tolerant to mismatches. Additional experiments with larger numbers of gRNAs will be needed to assess if and how these two parameters (numbers of mismatched sites in the genome and stability of the RNA:DNA hybrid) influence the genome-wide specificities of RGNs. However, it is important to note that even if such predictive parameters can be defined, the effect of implementing such guidelines would be to further restrict the targeting range of RGNs.
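The two parameters discussed in this paragraph (mismatch count and GC content of the complementarity region) can be computed as in the following minimal sketch. The function names are hypothetical and are not part of the disclosure; the sequences are the 20 nt complementarity regions of VEGFA target site 2 and off-target site OT2-1 from Table 2A with the 3 nt PAM removed, and positions are assumed to be numbered from the 5′ (PAM-distal) end.

```python
from typing import List


def gc_fraction(seq: str) -> float:
    """Return the fraction of G or C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(base in "GC" for base in seq) / len(seq)


def mismatch_positions(on_target: str, candidate: str) -> List[int]:
    """Return 1-based positions (5' to 3') at which two equal-length sites differ."""
    on_target, candidate = on_target.upper(), candidate.upper()
    if len(on_target) != len(candidate):
        raise ValueError("sequences must have equal length")
    return [i + 1 for i, (a, b) in enumerate(zip(on_target, candidate)) if a != b]


# 20 nt complementarity regions taken from Table 2A (PAM removed).
T2 = "GACCCCCTCCACCCCGCCTC"
OT2_1 = "GACCCCCCCCACCCCGCCCC"

print(mismatch_positions(T2, OT2_1))  # positions that differ from the on-target site
print(gc_fraction(T2))                # GC content of the on-target region
```

As the surrounding text cautions, such counts alone currently have limited predictive value; the sketch only shows how the quantities themselves are defined.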

One potential general strategy for reducing RGN-induced off-target effects might be to reduce the concentrations of gRNA and Cas9 nuclease expressed in the cell. This idea was tested using the RGNs for VEGFA target sites 2 and 3 in U2OS.EGFP cells; transfecting less sgRNA- and Cas9-expressing plasmid decreased the mutation rate at the on-target site but did not appreciably change the relative rates of off-target mutations (Tables 2A and 2B). Consistent with this, high-level off-target mutagenesis rates were also observed in two other human cell types (HEK293 and K562 cells) even though the absolute rates of on-target mutagenesis are lower than in U2OS.EGFP cells. Thus, reducing expression levels of gRNA and Cas9 in cells is not likely to provide a solution for reducing off-target effects. Furthermore, these results also suggest that the high rates of off-target mutagenesis observed in human cells are not caused by overexpression of gRNA and/or Cas9.

TABLE 2A
Indel mutation frequencies at on- and off-target genomic sites induced by different amounts of Cas9- and single gRNA-expressing plasmids for the RGN targeted to VEGFA Target Site 2

                                                      Mean indel frequency (%) ± SEM
                                                      250 ng gRNA/     12.5 ng gRNA/
Site            Sequence                 SEQ ID NO:   750 ng Cas9      50 ng Cas9
T2 (On-target)  GACCCCCTCCACCCCGCCTCCGG  12           50.2 ± 4.9       25.4 ± 4.8
OT2-1           GACCCCCCCCACCCCGCCCCCGG  13           14.4 ± 3.4        4.2 ± 0.2
OT2-2           GGGCCCCTCCACCCCGCCTCTGG  14           20.0 ± 6.2        9.8 ± 1.1
OT2-6           CTACCCCTCCACCCCGCCTCCGG  15            8.2 ± 1.4        6.0 ± 0.5
OT2-9           GCCCCCACCCACCCCGCCTCTGG  16           50.7 ± 5.6       16.4 ± 2.1
OT2-15          TACCCCCCACACCCCGCCTCTGG  17            9.7 ± 4.5        2.1 ± 0.0
OT2-17          ACACCCCCCCACCCCGCCTCAGG  18           14.0 ± 2.8        7.1 ± 0.0
OT2-19          ATTCCCCCCCACCCCGCCTCAGG  19           17.0 ± 3.3        9.2 ± 0.4
OT2-20          CCCCACCCCCACCCCGCCTCAGG  20            6.1 ± 1.3       N.D.
OT2-23          CGCCCTCCCCACCCCGCCTCCGG  21           44.4 ± 6.7       35.1 ± 1.8
OT2-24          CTCCCCACCCACCCCGCCTCAGG  22           62.8 ± 5.0       44.1 ± 4.5
OT2-29          TGCCCCTCCCACCCCGCCTCTGG  23           13.8 ± 5.2        5.0 ± 0.2
OT2-34          AGGCCCCCACACCCCGCCTCAGG  24            2.8 ± 1.5       N.D.

Amounts of gRNA- and Cas9-expressing plasmids transfected into U2OS.EGFP cells for these assays are shown at the top of each column. (Note that data for 250 ng gRNA/750 ng Cas9 are the same as those presented in Table 1.) Mean indel frequencies were determined using the T7EI assay from replicate samples as described in Methods. OT = Off-target sites, numbered as in Table 1 and Table B. Mismatches from the on-target site (within the 20 bp region to which the gRNA hybridizes) are highlighted as bold, underlined text. N.D. = none detected

TABLE 2B
Indel mutation frequencies at on- and off-target genomic sites induced by different amounts of Cas9- and single gRNA-expressing plasmids for the RGN targeted to VEGFA Target Site 3

                                                      Mean indel frequency (%) ± SEM
                                                      250 ng gRNA/     12.5 ng gRNA/
Site            Sequence                 SEQ ID NO:   750 ng Cas9      250 ng Cas9
T3 (On-target)  GGTGAGTGAGTGTGTGCGTGTGG  25           49.4 ± 3.8       33.0 ± 3.7
OT3-1           GGTGAGTGAGTGTGTGTGTGAGG  26            7.4 ± 3.4       N.D.
OT3-2           AGTGAGTGAGTGTGTGTGTGGGG  27           24.3 ± 9.2        9.8 ± 4.2
OT3-4           GCTGAGTGAGTGTATGCGTGTGG  28           20.9 ± 11.8       4.2 ± 1.2
OT3-9           GGTGAGTGAGTGCGTGCGGGTGG  29            3.2 ± 0.3       N.D.
OT3-17          GTTGAGTGAATGTGTGCGTGAGG  30            2.9 ± 0.2       N.D.
OT3-18          TGTGGGTGAGTGTGTGCGTGAGG  31           13.4 ± 4.2        4.9 ± 0.0
OT3-20          AGAGAGTGAGTGTGTGCATGAGG  32           16.7 ± 3.5        7.9 ± 2.4

Amounts of gRNA- and Cas9-expressing plasmids transfected into U2OS.EGFP cells for these assays are shown at the top of each column. (Note that data for 250 ng gRNA/750 ng Cas9 are the same as those presented in Table 1.) Mean indel frequencies were determined using the T7EI assay from replicate samples as described in Methods. OT = Off-target sites, numbered as in Table 1 and Table B. N.D. = none detected
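The "relative rates of off-target mutations" referred to above can be read as the ratio of the off-target indel frequency to the on-target indel frequency at a given plasmid dose. The following is a minimal sketch of that arithmetic, assuming this ratio definition (the text does not define it formally) and using three rows of Table 2A; the helper name and the choice of rows are illustrative only, and SEM values are ignored.

```python
def relative_rate(off_target_pct: float, on_target_pct: float) -> float:
    """Off-target indel frequency expressed relative to the on-target frequency."""
    if on_target_pct <= 0:
        raise ValueError("on-target frequency must be positive")
    return off_target_pct / on_target_pct


# Mean indel frequencies (%) from Table 2A: (250 ng gRNA/750 ng Cas9, 12.5 ng gRNA/50 ng Cas9)
ON_TARGET_T2 = (50.2, 25.4)
OFF_TARGETS = {
    "OT2-2": (20.0, 9.8),
    "OT2-23": (44.4, 35.1),
    "OT2-24": (62.8, 44.1),
}

for site, (high_dose, low_dose) in OFF_TARGETS.items():
    ratio_high = relative_rate(high_dose, ON_TARGET_T2[0])
    ratio_low = relative_rate(low_dose, ON_TARGET_T2[1])
    print(f"{site}: {ratio_high:.2f} at the higher dose, {ratio_low:.2f} at the lower dose")
```

Sites reported as N.D. at the lower dose have no ratio under this definition and are omitted from the sketch.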

The finding that significant off-target mutagenesis can be induced by RGNs in three different human cell types has important implications for broader use of this genome-editing platform. For research applications, the potentially confounding effects of high frequency off-target mutations will need to be considered, particularly for experiments involving either cultured cells or organisms with slow generation times for which the outcrossing of undesired alterations would be challenging. One way to control for such effects might be to utilize multiple RGNs targeted to different DNA sequences to induce the same genomic alteration because off-target effects are not random but instead related to the targeted site. However, for therapeutic applications, these findings clearly indicate that the specificities of RGNs will need to be carefully defined and/or improved if these nucleases are to be used safely in the longer term for treatment of human diseases.

Methods for Improving Specificity

As shown herein, CRISPR-Cas RNA-guided nucleases based on the S. pyogenes Cas9 protein can have significant off-target mutagenic effects that are comparable to or higher than the intended on-target activity (Example 1). Such off-target effects can be problematic for research and in particular for potential therapeutic applications. Therefore, methods for improving the specificity of CRISPR-Cas RNA-guided nucleases (RGNs) are needed.

As described in Example 1, Cas9 RGNs can induce high-frequency indel mutations at off-target sites in human cells (see also Cradick et al., 2013; Fu et al., 2013; Hsu et al., 2013; Pattanayak et al., 2013). These undesired alterations can occur at genomic sequences that differ by as many as five mismatches from the intended on-target site (see Example 1). In addition, although mismatches at the 5′ end of the gRNA complementarity region are generally better tolerated than those at the 3′ end, these associations are not absolute and show site-to-site dependence (see Example 1 and Fu et al., 2013; Hsu et al., 2013; Pattanayak et al., 2013). As a result, computational methods that rely on the number and/or positions of mismatches currently have limited predictive value for identifying bona fide off-target sites. Therefore, methods for reducing the frequencies of off-target mutations remain an important priority if RNA-guided nucleases are to be used for research and therapeutic applications.

Truncated Guide RNAs (Tru-gRNAs) Achieve Greater Specificity

Guide RNAs generally speaking come in two different systems: System 1, which uses separate crRNA and tracrRNAs that function together to guide cleavage by Cas9, and System 2, which uses a chimeric crRNA-tracrRNA hybrid that combines the two separate guide RNAs in a single system (referred to as a single guide RNA or sgRNA, see also Jinek et al., Science 2012; 337:816-821). The tracrRNA can be variably truncated and a range of lengths has been shown to function in both the separate system (System 1) and the chimeric gRNA system (System 2). For example, in some embodiments, tracrRNA may be truncated from its 3′ end by at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35 or 40 nts. In some embodiments, the tracrRNA molecule may be truncated from its 5′ end by at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35 or 40 nts. Alternatively, the tracrRNA molecule may be truncated from both the 5′ and 3′ end, e.g., by at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15 or 20 nts on the 5′ end and at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35 or 40 nts on the 3′ end. See, e.g., Jinek et al., Science 2012; 337:816-821; Mali et al., Science. 2013 Feb. 15; 339(6121):823-6; Cong et al., Science. 2013 Feb. 15; 339(6121):819-23; Hwang and Fu et al., Nat Biotechnol. 2013 March; 31(3):227-9; and Jinek et al., Elife 2, e00471 (2013). For System 2, generally the longer length chimeric gRNAs have shown greater on-target activity but the relative specificities of the various length gRNAs currently remain undefined and therefore it may be desirable in certain instances to use shorter gRNAs. In some embodiments, the gRNAs are complementary to a region that is within about 100-800 bp upstream of the transcription start site, e.g., is within about 500 bp upstream of the transcription start site, includes the transcription start site, or within about 100-800 bp, e.g., within about 500 bp, downstream of the transcription start site. In some embodiments, vectors (e.g., plasmids) encoding more than one gRNA are used, e.g., plasmids encoding 2, 3, 4, 5, or more gRNAs directed to different sites in the same region of the target gene.

The present application describes a strategy for improving RGN specificity based on the seemingly counterintuitive idea of shortening, rather than lengthening, the gRNA complementarity region. These shorter gRNAs can induce various types of Cas9-mediated on-target genome editing events with efficiencies comparable to (or, in some cases, higher than) full-length gRNAs at multiple sites in a single integrated EGFP reporter gene and in endogenous human genes. In addition, RGNs using these shortened gRNAs exhibit increased sensitivity to small numbers of mismatches at the gRNA-target DNA interface. Most importantly, use of shortened gRNAs substantially reduces the rates of genomic off-target effects in human cells, yielding improvements of specificity as high as 5000-fold or more at these sites. Thus, this shortened gRNA strategy provides a highly effective approach for reducing off-target effects without compromising on-target activity and without the need for expression of a second, potentially mutagenic gRNA. This approach can be implemented on its own or in conjunction with other strategies such as the paired nickase method to reduce the off-target effects of RGNs in human cells.

Thus, one method to enhance specificity of CRISPR/Cas nucleases shortens the length of the guide RNA (gRNA) species used to direct nuclease specificity. Cas9 nuclease can be guided to specific 17-18 nt genomic targets bearing an additional proximal protospacer adjacent motif (PAM), e.g., of sequence NGG, using a guide RNA, e.g., a single gRNA or a crRNA (paired with a tracrRNA), bearing 17 or 18 nts at its 5′ end that are complementary to the complementary strand of the genomic DNA target site (FIG. 1).
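Candidate 17 or 18 nt targets lying immediately 5′ of an NGG PAM can be enumerated along the lines of the following minimal sketch. The function name is hypothetical, only the strand supplied is scanned (a practical search would also scan the reverse complement), and the example input is the VEGFA target site 2 sequence from Table 2A.

```python
import re
from typing import List, Tuple


def find_tru_grna_sites(dna: str, length: int = 17) -> List[Tuple[int, str, str]]:
    """List (start, target, PAM) for every `length`-nt site directly 5' of an NGG PAM.

    Only the given strand is scanned; `start` is a 0-based index into `dna`.
    """
    if length not in (17, 18):
        raise ValueError("tru-gRNA targets in this sketch are 17 or 18 nt long")
    dna = dna.upper()
    sites = []
    # A zero-width lookahead lets overlapping PAMs be reported.
    for match in re.finditer(r"(?=([ACGT]GG))", dna):
        pam_start = match.start()
        if pam_start >= length:
            target = dna[pam_start - length:pam_start]
            sites.append((pam_start - length, target, match.group(1)))
    return sites


# VEGFA target site 2 (20 nt complementarity region plus PAM) from Table 2A.
print(find_tru_grna_sites("GACCCCCTCCACCCCGCCTCCGG", length=17))
```

For a full-length 20 nt site, the 17 nt target found this way corresponds to the same site with its three 5′-most nucleotides removed.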

Although one might expect that increasing the length of the gRNA complementarity region would improve specificity, the present inventors (Hwang et al., PLoS One. 2013 Jul. 9; 8(7):e68708) and others (Ran et al., Cell. 2013 Sep. 12; 154(6):1380-9) have previously observed that lengthening the target site complementarity region at the 5′ end of the gRNA actually makes it function less efficiently at the on-target site.

By contrast, experiments in Example 1 showed that gRNAs bearing multiple mismatches within a standard length 5′ complementarity targeting region could still induce robust Cas9-mediated cleavage of their target sites. Thus, it was possible that truncated gRNAs lacking these 5′-end nucleotides might show activities comparable to their full-length counterparts (FIG. 2A). It was further speculated that these 5′ nucleotides might normally compensate for mismatches at other positions along the gRNA-target DNA interface and therefore predicted that shorter gRNAs might be more sensitive to mismatches and thus induce lower levels of off-target mutations (FIG. 2A).

Decreasing the length of the DNA sequence targeted might also decrease the stability of the gRNA:DNA hybrid, making it less tolerant of mismatches and thereby making the targeting more specific. That is, truncating the gRNA sequence to recognize a shorter DNA target might actually result in an RNA-guided nuclease that is less tolerant to even single nucleotide mismatches and is therefore more specific and has fewer unintended off-target effects.

This strategy for shortening the gRNA complementarity region could potentially be used with RNA-guided proteins other than S. pyogenes Cas9, including other Cas proteins from bacteria or archaea as well as Cas9 variants that nick a single strand of DNA or have no nuclease activity, such as a dCas9 bearing catalytically inactivating mutations in one or both nuclease domains. This strategy can be applied to systems that utilize a single gRNA as well as those that use dual gRNAs (e.g., the crRNA and tracrRNA found in naturally occurring systems).

Thus, described herein is a single guide RNA comprising a crRNA fused to a normally trans-encoded tracrRNA, e.g., a single Cas9 guide RNA as described in Mali et al., Science 2013 Feb. 15; 339(6121):823-6, but with a sequence at the 5′ end that is complementary to fewer than 20 nucleotides (nts), e.g., 19, 18, or 17 nts, preferably 17 or 18 nts, of the complementary strand to a target sequence immediately 5′ of a protospacer adjacent motif (PAM), e.g., NGG, NAG, or NNGG. In some embodiments, the shortened Cas9 guide RNA consists of the sequence:

-   (SEQ ID NO: 2404) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUA;
-   (SEQ ID NO: 2407) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUAUGCUGUUUUG;
-   (SEQ ID NO: 2408) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUAUGCU;
-   (SEQ ID NO: 1) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCG(X_(N));
-   (SEQ ID NO: 2) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUAUGCUGAAAAGCAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUC(X_(N));
-   (SEQ ID NO: 3) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUAUGCUGUUUUGGAAACAAAACAGCAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUC(X_(N));
-   (SEQ ID NO: 4) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC(X_(N));
-   (SEQ ID NO: 5) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUAAGAGCUAGAAAUAGCAAGUUUAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC;
-   (SEQ ID NO: 6) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUAUGCUGGAAACAGCAUAGCAAGUUUAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC; or
-   (SEQ ID NO: 7) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUAAGAGCUAUGCUGGAAACAGCAUAGCAAGUUUAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC;

wherein X₁₇₋₁₈ or X₁₇₋₁₉ is the nucleotide sequence complementary to 17-18 or 17-19 consecutive nucleotides of the target sequence, respectively. Also described herein are DNAs encoding the shortened Cas9 guide RNAs that have been described previously in the literature (Jinek et al., Science. 337(6096):816-21 (2012) and Jinek et al., Elife. 2:e00471 (2013)).

The guide RNAs can include X_(N) which can be any sequence, wherein N (in the RNA) can be 0-200, e.g., 0-100, 0-50, or 0-20, that does not interfere with the binding of the ribonucleic acid to Cas9.

In some embodiments, the guide RNA includes one or more Adenine (A) or Uracil (U) nucleotides on the 3′ end. In some embodiments the RNA includes one or more U, e.g., 1 to 8 or more Us (e.g., U, UU, UUU, UUUU, UUUUU, UUUUUU, UUUUUUU, UUUUUUUU) at the 3′ end of the molecule, as a result of the optional presence of one or more Ts used as a termination signal to terminate RNA PolIII transcription.
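Assembly of a truncated single guide RNA from the parts described above can be sketched as follows. This is an illustration only: the function name is hypothetical, the constant region is that of SEQ ID NO: 4 with X_(N) omitted (N = 0), and the 17-19 nt target is assumed to be supplied as DNA read 5′ to 3′ on the PAM-containing strand, so that the guide is obtained simply by writing U for T.

```python
# Constant region of SEQ ID NO: 4 as listed in this document, with X_N omitted (N = 0).
SCAFFOLD_SEQ_ID_NO_4 = (
    "GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCG"
    "UUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC"
)


def build_tru_sgrna(target_dna: str, terminal_u: int = 0) -> str:
    """Return X(17-19) + scaffold + optional 3' U tract as an RNA string.

    target_dna: 17-19 nt target site on the PAM-containing strand (assumption).
    terminal_u: number of 3' U residues, e.g. left by a Pol III terminator.
    """
    target = target_dna.upper()
    if not 17 <= len(target) <= 19:
        raise ValueError("complementarity region must be 17-19 nt")
    if set(target) - set("ACGT"):
        raise ValueError("target must contain only A, C, G or T")
    if terminal_u < 0:
        raise ValueError("terminal_u must be non-negative")
    return target.replace("T", "U") + SCAFFOLD_SEQ_ID_NO_4 + "U" * terminal_u


# 17 nt truncation of VEGFA target site 2 (Table 2A), with a four-U 3' tail.
print(build_tru_sgrna("CCCCTCCACCCCGCCTC", terminal_u=4))
```

Any of the other listed constant regions could be substituted for the scaffold string without changing the logic.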

Modified RNA oligonucleotides such as locked nucleic acids (LNAs) have been demonstrated to increase the specificity of RNA-DNA hybridization by locking the modified oligonucleotides in a more favorable (stable) conformation. For example, a locked nucleic acid is a modified nucleotide in which there is an additional covalent linkage between the 2′ oxygen and 4′ carbon which when incorporated into oligonucleotides can improve overall thermal stability and selectivity (formula I).

Thus in some embodiments, the tru-gRNAs disclosed herein may comprise one or more modified RNA oligonucleotides. For example, the truncated guide RNA molecules described herein can have one, some or all of the nucleotides in the 17-18 or 17-19 nt 5′ region of the guide RNA complementary to the target sequence modified, e.g., locked (2′-O-4′-C methylene bridge), 5′-methylcytidine, 2′-O-methyl-pseudouridine, or in which the ribose phosphate backbone has been replaced by a polyamide chain (peptide nucleic acid), e.g., a synthetic ribonucleic acid.

In other embodiments, one, some or all of the nucleotides of the tru-gRNA sequence may be modified, e.g., locked (2′-O-4′-C methylene bridge), 5′-methylcytidine, 2′-O-methyl-pseudouridine, or in which the ribose phosphate backbone has been replaced by a polyamide chain (peptide nucleic acid), e.g., a synthetic ribonucleic acid.

In a cellular context, complexes of Cas9 with these synthetic gRNAs could be used to improve the genome-wide specificity of the CRISPR/Cas9 nuclease system.

Exemplary modified or synthetic tru-gRNAs may comprise, or consist of, the following sequences:

-   (SEQ ID NO: 2404) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUA(X_(N));
-   (SEQ ID NO: 2407) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUAUGCUGUUUUG(X_(N));
-   (SEQ ID NO: 2408) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUAUGCU(X_(N));
-   (SEQ ID NO: 1) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCG(X_(N));
-   (SEQ ID NO: 2) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUAUGCUGAAAAGCAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUC(X_(N));
-   (SEQ ID NO: 3) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUAUGCUGUUUUGGAAACAAAACAGCAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUC(X_(N));
-   (SEQ ID NO: 4) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC(X_(N));
-   (SEQ ID NO: 5) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUAAGAGCUAGAAAUAGCAAGUUUAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC;
-   (SEQ ID NO: 6) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUAUGCUGGAAACAGCAUAGCAAGUUUAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC; or
-   (SEQ ID NO: 7) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUAAGAGCUAUGCUGGAAACAGCAUAGCAAGUUUAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC;

wherein X₁₇₋₁₈ or X₁₇₋₁₉ is a sequence complementary to 17-18 or 17-19 nts of a target sequence, respectively, preferably a target sequence immediately 5′ of a protospacer adjacent motif (PAM), e.g., NGG, NAG, or NNGG, and further wherein one or more of the nucleotides are locked, e.g., one or more of the nucleotides within the sequence X₁₇₋₁₈ or X₁₇₋₁₉, one or more of the nucleotides within the sequence X_(N), or one or more of the nucleotides within any sequence of the tru-gRNA. X_(N) is any sequence, wherein N (in the RNA) can be 0-200, e.g., 0-100, 0-50, or 0-20, that does not interfere with the binding of the ribonucleic acid to Cas9. In some embodiments the RNA includes one or more U, e.g., 1 to 8 or more Us (e.g., U, UU, UUU, UUUU, UUUUU, UUUUUU, UUUUUUU, UUUUUUUU) at the 3′ end of the molecule, as a result of the optional presence of one or more Ts used as a termination signal to terminate RNA PolIII transcription.

Although some of the examples described herein utilize a single gRNA, the methods can also be used with dual gRNAs (e.g., the crRNA and tracrRNA found in naturally occurring systems). In this case, a single tracrRNA would be used in conjunction with multiple different crRNAs expressed using the present system, e.g., the following: (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUA (SEQ ID NO:2404); (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUAUGCUGUUUUG (SEQ ID NO:2407); or (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUAUGCU (SEQ ID NO:2408); and a tracrRNA sequence. In this case, the crRNA is used as the guide RNA in the methods and molecules described herein, and the tracrRNA can be expressed from the same or a different DNA molecule. In some embodiments, the methods include contacting the cell with a tracrRNA comprising or consisting of the sequence GGAACCAUUCAAAACAGCAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC (SEQ ID NO:8) or an active portion thereof (an active portion is one that retains the ability to form complexes with Cas9 or dCas9). In some embodiments, the tracrRNA molecule may be truncated from its 3′ end by at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35 or 40 nts. In another embodiment, the tracrRNA molecule may be truncated from its 5′ end by at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35 or 40 nts. Alternatively, the tracrRNA molecule may be truncated from both the 5′ and 3′ end, e.g., by at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15 or 20 nts on the 5′ end and at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35 or 40 nts on the 3′ end. Exemplary tracrRNA sequences in addition to SEQ ID NO:8 include the following:

-   UAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC (SEQ ID NO:2405) or an active portion thereof;
-   AGCAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC (SEQ ID NO:2406) or an active portion thereof;
-   CAAAACAGCAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC (SEQ ID NO:2409) or an active portion thereof;
-   UAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUG (SEQ ID NO:2410) or an active portion thereof;
-   UAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCA (SEQ ID NO:2411) or an active portion thereof; or
-   UAGCAAGUUAAAAUAAGGCUAGUCCG (SEQ ID NO:2412) or an active portion thereof.
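The 5′ and 3′ truncations of the tracrRNA described above amount to simple trimming of the SEQ ID NO:8 string, as in the following minimal sketch. The function name is hypothetical, and the sketch says nothing about whether a given truncation remains an active portion (i.e., still forms complexes with Cas9 or dCas9), which has to be established experimentally. On inspection, removing 19 nt from the 5′ end of SEQ ID NO:8 appears to give the sequence listed above as SEQ ID NO:2405.

```python
# tracrRNA of SEQ ID NO:8 as given in this document.
SEQ_ID_NO_8 = (
    "GGAACCAUUCAAAACAGCAUAGCAAGUUAAAAUAAGGCUAGUCCG"
    "UUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC"
)


def truncate_tracrrna(tracr: str, five_prime: int = 0, three_prime: int = 0) -> str:
    """Remove `five_prime` nt from the 5' end and `three_prime` nt from the 3' end."""
    if five_prime < 0 or three_prime < 0:
        raise ValueError("truncation lengths must be non-negative")
    if five_prime + three_prime >= len(tracr):
        raise ValueError("truncation would remove the entire molecule")
    return tracr[five_prime:len(tracr) - three_prime]


# Trim 19 nt from the 5' end only, then additionally 15 nt from the 3' end.
print(truncate_tracrrna(SEQ_ID_NO_8, five_prime=19))
print(truncate_tracrrna(SEQ_ID_NO_8, five_prime=19, three_prime=15))
```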

In some embodiments wherein (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUAUGCUGUUUUG (SEQ ID NO:2407) is used as a crRNA, the following tracrRNA is used:

-   GGAACCAUUCAAAACAGCAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC (SEQ ID NO:8) or an active portion thereof.

In some embodiments wherein (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUA (SEQ ID NO:2404) is used as a crRNA, the following tracrRNA is used:

-   UAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC (SEQ ID NO:2405) or an active portion thereof.

In some embodiments wherein (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUAUGCU (SEQ ID NO:2408) is used as a crRNA, the following tracrRNA is used:

-   AGCAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC (SEQ ID NO:2406) or an active portion thereof.

In addition, in a system that uses separate crRNA and tracrRNA, one or both can be synthetic and include one or more modified (e.g., locked) nucleotides or deoxyribonucleotides.

In some embodiments, the single guide RNAs and/or crRNAs and/or tracrRNAs can include one or more Adenine (A) or Uracil (U) nucleotides on the 3′ end.

Existing Cas9-based RGNs use gRNA-DNA heteroduplex formation to guide targeting to genomic sites of interest. However, RNA-DNA heteroduplexes can form a more promiscuous range of structures than their DNA-DNA counterparts. In effect, DNA-DNA duplexes are more sensitive to mismatches, suggesting that a DNA-guided nuclease may not bind as readily to off-target sequences, making it comparatively more specific than RNA-guided nucleases. Thus, the truncated guide RNAs described herein can be hybrids, i.e., wherein one or more deoxyribonucleotides, e.g., a short DNA oligonucleotide, replaces all or part of the gRNA, e.g., all or part of the complementarity region of a gRNA. This DNA-based molecule could replace either all or part of the gRNA in a single gRNA system or alternatively might replace all or part of the crRNA in a dual crRNA/tracrRNA system. Such a system that incorporates DNA into the complementarity region should more reliably target the intended genomic DNA sequences due to the general intolerance of DNA-DNA duplexes to mismatching compared to RNA-DNA duplexes. Methods for making such duplexes are known in the art. See, e.g., Barker et al., BMC Genomics. 2005 Apr. 22; 6:57; and Sugimoto et al., Biochemistry. 2000 Sep. 19; 39(37):11270-81.

Exemplary modified or synthetic tru-gRNAs may comprise, or consist of, the following sequences:

-   (SEQ ID NO: 1) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCG(X_(N));
-   (SEQ ID NO: 2) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUAUGCUGAAAAGCAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUC(X_(N));
-   (SEQ ID NO: 3) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUAUGCUGUUUUGGAAACAAAACAGCAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUC(X_(N));
-   (SEQ ID NO: 4) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC(X_(N));
-   (SEQ ID NO: 5) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUAAGAGCUAGAAAUAGCAAGUUUAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC;
-   (SEQ ID NO: 6) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUUAGAGCUAUGCUGGAAACAGCAUAGCAAGUUUAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC; or
-   (SEQ ID NO: 7) (X₁₇₋₁₈ or X₁₇₋₁₉)GUUUAAGAGCUAUGCUGGAAACAGCAUAGCAAGUUUAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC;

wherein X₁₇₋₁₈ or X₁₇₋₁₉ is a sequence complementary to 17-18 or 17-19 nts of a target sequence, respectively, preferably a target sequence immediately 5′ of a protospacer adjacent motif (PAM), e.g., NGG, NAG, or NNGG, and further wherein one or more of the nucleotides are deoxyribonucleotides, e.g., one or more of the nucleotides within the sequence X₁₇₋₁₈ or X₁₇₋₁₉, one or more of the nucleotides within the sequence X_(N), or one or more of the nucleotides within any sequence of the tru-gRNA. X_(N) is any sequence, wherein N (in the RNA) can be 0-200, e.g., 0-100, 0-50, or 0-20, that does not interfere with the binding of the ribonucleic acid to Cas9. In some embodiments the RNA includes one or more U, e.g., 1 to 8 or more Us (e.g., U, UU, UUU, UUUU, UUUUU, UUUUUU, UUUUUUU, UUUUUUUU) at the 3′ end of the molecule, as a result of the optional presence of one or more Ts used as a termination signal to terminate RNA PolIII transcription.

In addition, in a system that uses separate crRNA and tracrRNA, one or both can be synthetic and include one or more deoxyribonucleotides.

In some embodiments, the single guide RNAs or crRNAs or tracrRNAs include one or more Adenine (A) or Uracil (U) nucleotides on the 3′ end.

In some embodiments, the gRNA is targeted to a site that differs by at least three mismatches from any sequence in the rest of the genome in order to minimize off-target effects.

The methods described can include expressing in a cell, or contacting the cell with, a shortened Cas9 gRNA (tru-gRNA) as described herein (optionally a modified or DNA/RNA hybrid tru-gRNA), plus a nuclease that can be guided by the shortened Cas9 gRNAs, e.g., a Cas9 nuclease, e.g., as described in Mali et al.; a Cas9 nickase as described in Jinek et al., 2012; or a dCas9-heterofunctional domain fusion (dCas9-HFD).
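The three-mismatch criterion can be expressed as a simple screen, sketched below under strong simplifying assumptions: the function name is hypothetical, only one strand of a supplied reference sequence is scanned, PAM requirements at the other sites are ignored, and a single exact match is taken to be the intended on-target site. As noted elsewhere herein, sites with more than three mismatches can still be mutated, so passing this screen does not rule out off-target activity.

```python
def passes_mismatch_rule(target: str, reference: str, min_mismatches: int = 3) -> bool:
    """Return True if no other window of `reference` is within fewer than
    `min_mismatches` mismatches of `target`.

    One exact match is tolerated (assumed to be the on-target site itself);
    a second exact match counts as a failure.
    """
    target, reference = target.upper(), reference.upper()
    n = len(target)
    exact_matches = 0
    for start in range(len(reference) - n + 1):
        window = reference[start:start + n]
        distance = sum(a != b for a, b in zip(target, window))
        if distance == 0:
            exact_matches += 1
            if exact_matches > 1:
                return False
        elif distance < min_mismatches:
            return False
    return True


# Toy 17 nt target and a variant differing at three positions (illustrative sequences only).
toy_target = "ACGTACGTACGTACGTA"
toy_variant = "TCGTACGTTCGTACGTT"
print(passes_mismatch_rule(toy_target, toy_variant))
```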

Cas9

A number of bacteria express Cas9 protein variants. The Cas9 fromStreptococcus pyogenes is presently the most commonly used; some of theother Cas9 proteins have high levels of sequence identity with the S.pyogenes Cas9 and use the same guide RNAs. Others are more diverse, usedifferent gRNAs, and recognize different PAM sequences as well (the 2-5nucleotide sequence specified by the protein which is adjacent to thesequence specified by the RNA). Chylinski et al. classified Cas9proteins from a large group of bacteria (RNA Biology 10:5, 1-12; 2013),and a large number of Cas9 proteins are listed in supplementary FIG. 1and supplementary table 1 thereof, which are incorporated by referenceherein. Additional Cas9 proteins are described in Esvelt et al., NatMethods. 2013 November; 10(11):1116-21 and Fonfara et al., “Phylogeny ofCas9 determines functional exchangeability of dual-RNA and Cas9 amongorthologous type II CRISPR-Cas systems.” Nucleic Acids Res. 2013 Nov.22. [Epub ahead of print] doi:10.1093/nar/gkt1074.

Cas9 molecules of a variety of species can be used in the methods and compositions described herein. While the S. pyogenes and S. thermophilus Cas9 molecules are the subject of much of the disclosure herein, Cas9 molecules of, derived from, or based on the Cas9 proteins of other species listed herein can be used as well. In other words, while much of the description herein uses S. pyogenes and S. thermophilus Cas9 molecules, Cas9 molecules from the other species can replace them. Such species include those set forth in the following table, which was created based on supplementary FIG. 1 of Chylinski et al., 2013.

Alternative Cas9 proteins

GenBank Acc No.   Bacterium
303229466         Veillonella atypica ACS-134-V-Col7a
34762592          Fusobacterium nucleatum subsp. vincentii
374307738         Filifactor alocis ATCC 35896
320528778         Solobacterium moorei F0204
291520705         Coprococcus catus GD-7
42525843          Treponema denticola ATCC 35405
304438954         Peptoniphilus duerdenii ATCC BAA-1640
224543312         Catenibacterium mitsuokai DSM 15897
24379809          Streptococcus mutans UA159
15675041          Streptococcus pyogenes SF370
16801805          Listeria innocua Clip11262
116628213         Streptococcus thermophilus LMD-9
323463801         Staphylococcus pseudintermedius ED99
352684361         Acidaminococcus intestini RyC-MR95
302336020         Olsenella uli DSM 7084
366983953         Oenococcus kitaharae DSM 17330
310286728         Bifidobacterium bifidum S17
258509199         Lactobacillus rhamnosus GG
300361537         Lactobacillus gasseri JV-V03
169823755         Finegoldia magna ATCC 29328
47458868          Mycoplasma mobile 163K
284931710         Mycoplasma gallisepticum str. F
363542550         Mycoplasma ovipneumoniae SC01
384393286         Mycoplasma canis PG 14
71894592          Mycoplasma synoviae 53
238924075         Eubacterium rectale ATCC 33656
116627542         Streptococcus thermophilus LMD-9
315149830         Enterococcus faecalis TX0012
315659848         Staphylococcus lugdunensis M23590
160915782         Eubacterium dolichum DSM 3991
336393381         Lactobacillus coryniformis subsp. torquens
310780384         Ilyobacter polytropus DSM 2926
325677756         Ruminococcus albus 8
187736489         Akkermansia muciniphila ATCC BAA-835
117929158         Acidothermus cellulolyticus 11B
189440764         Bifidobacterium longum DJO10A
283456135         Bifidobacterium dentium Bd1
38232678          Corynebacterium diphtheriae NCTC 13129
187250660         Elusimicrobium minutum Pei191
319957206         Nitratifractor salsuginis DSM 16511
325972003         Sphaerochaeta globus str. Buddy
261414553         Fibrobacter succinogenes subsp. succinogenes
60683389          Bacteroides fragilis NCTC 9343
256819408         Capnocytophaga ochracea DSM 7271
90425961          Rhodopseudomonas palustris BisB18
373501184         Prevotella micans F0438
294674019         Prevotella ruminicola 23
365959402         Flavobacterium columnare ATCC 49512
312879015         Aminomonas paucivorans DSM 12260
83591793          Rhodospirillum rubrum ATCC 11170
294086111         Candidatus Puniceispirillum marinum IMCC1322
121608211         Verminephrobacter eiseniae EF01-2
344171927         Ralstonia syzygii R24
159042956         Dinoroseobacter shibae DFL 12
288957741         Azospirillum sp. B510
92109262          Nitrobacter hamburgensis X14
148255343         Bradyrhizobium sp. BTAi1
34557790          Wolinella succinogenes DSM 1740
218563121         Campylobacter jejuni subsp. jejuni
291276265         Helicobacter mustelae 12198
229113166         Bacillus cereus Rock1-15
222109285         Acidovorax ebreus TPSY
189485225         uncultured Termite group 1
182624245         Clostridium perfringens D str.
220930482         Clostridium cellulolyticum H10
154250555         Parvibaculum lavamentivorans DS-1
257413184         Roseburia intestinalis L1-82
218767588         Neisseria meningitidis Z2491
15602992          Pasteurella multocida subsp. multocida
319941583         Sutterella wadsworthensis 3 1
254447899         gamma proteobacterium HTCC5015
54296138          Legionella pneumophila str. Paris
331001027         Parasutterella excrementihominis YIT 11859
34557932          Wolinella succinogenes DSM 1740
118497352         Francisella novicida U112

The constructs and methods described herein can include the use of any of those Cas9 proteins, and their corresponding guide RNAs or other guide RNAs that are compatible. The Cas9 from the Streptococcus thermophilus LMD-9 CRISPR1 system has also been shown to function in human cells in Cong et al. (Science 339, 819 (2013)). Cas9 orthologs from N. meningitidis are described in Hou et al., Proc Natl Acad Sci USA. 2013 Sep. 24; 110(39):15644-9 and Esvelt et al., Nat Methods. 2013 November; 10(11):1116-21. Additionally, Jinek et al. showed in vitro that Cas9 orthologs from S. thermophilus and L. innocua (but not from N. meningitidis or C. jejuni, which likely use a different guide RNA) can be guided by a dual S. pyogenes gRNA to cleave target plasmid DNA, albeit with slightly decreased efficiency.

In some embodiments, the present system utilizes the Cas9 protein from S. pyogenes, either as encoded in bacteria or codon-optimized for expression in mammalian cells, containing mutations at D10, E762, H983, or D986 and H840 or N863, e.g., D10A/D10N and H840A/H840N/H840Y, to render the nuclease portion of the protein catalytically inactive; substitutions at these positions could be alanine (as they are in Nishimasu et al., Cell 156, 935-949 (2014)) or they could be other residues, e.g., glutamine, asparagine, tyrosine, serine, or aspartate, e.g., E762Q, H983N, H983Y, D986N, N863D, N863S, or N863H (FIG. 1C). The sequence of the catalytically inactive S. pyogenes Cas9 that can be used in the methods and compositions described herein is as follows; the exemplary mutations of D10A and H840A are in bold and underlined.

(SEQ ID NO: 33)        10         20         30         40         50         60MDKKYSIGL A  IGTNSVGWAV ITDEYKVPSK KFKVLGNTDR HSIKKNLIGA LLFDSGETAE        70         80         90        100        110        120ATRLKRTARR RYTRRKNRIC YLQEIFSNEM AKVDDSFFHR LEESFLVEED KKHERHPIFG       130        140        150        160        170        180NIVDEVAYHE KYPTIYHLRK KLVDSTDKAD LRLIYLALAH MIKFRGHFLI EGDLNPDNSD       190        200        210        220        230        240VDKLFIQLVQ TYNQLFEENP INASGVDAKA ILSARLSKSR RLENLIAQLP GEKKNGLFGN       250        260        270        280        290        300LIALSLGLTP NFKSNFDLAE DAKLQLSKDT YDDDLDNLLA QIGDQYADLF LAAKNLSDAI       310        320        330        340        350        360LLSDILRVNT EITKAPLSAS MIKRYDEHHQ DLTLLKALVR QQLPEKYKEI FFDQSKNGYA       370        380        390        400        410        420GYIDGGASQE EFYKFIKPIL EKMDGTEELL VKLNREDLLR KQRTFDNGSI PHQIHLGELH       430        440        450        460        470        480AILRRQEDFY PFLKDNREKI EKILTFRIPY YVGPLARGNS RFAWMTRKSE ETITPWNFEE       490        500        510        520        530        540VVDKGASAQS FIERMTNFDK NLPNEKVLPK HSLLYEYFTV YNELTKVKYV TEGMRKPAFL       550        560        570        580        590        600SGEQKKAIVD LLFKTNRKVT VKQLKEDYFK KIECFDSVEI SGVEDRFNAS LGTYHDLLKI       610        620        630        640        650        660IKDKDFLDNE ENEDILEDIV LTLTLFEDRE MIEERLKTYA HLFDDKVMKQ LKRRRYTGWG       670        680        690        700        710        720RLSRKLINGI RDKQSGKTIL DFLKSDGFAN RNFMQLIHDD SLTFKEDIQK AQVSGQGDSL       730        740        750        760        770        780HEHIANLAGS PAIKKGILQT VKVVDELVKV MGRHKPENIV IEMARENQTT QKGQKNSRER       790        800        810        820        830        840MKRIEEGIKE LGSQILKEHP VENTQLQNEK LYLYYLQNGR DMYVDQELDI NRLSDYDVD A       850        860        870        880        890        900IVPQSFLKDD SIDNKVLTRS DKNRGKSDNV PSEEVVKKMK NYWRQLLNAK LITQRKFDNL       910        920        
930        940        950        960TKAERGGLSE LDKAGFIKRQ LVETRQITKH VAQILDSRMN TKYDENDKLI REVKVITLKS       970        980        990       1000       1010       1020KLVSDFRKDF QFYKVREINN YHHAHDAYLN AVVGTALIKK YPKLESEFVY GDYKVYDVRK      1030       1040       1050       1060       1070       1080MIAKSEQEIG KATAKYFFYS NIMNFFKTEI TLANGEIRKR PLIETNGETG EIVWDKGRDF      1090       1100       1110       1120       1130       1140ATVRKVLSMP QVNIVKKTEV QTGGFSKESI LPKRNSDKLI ARKKDWDPKK YGGFDSPTVA      1150       1160       1170       1180       1190       1200YSVLVVAKVE KGKSKKLKSV KELLGITIME RSSFEKNPID FLEAKGYKEV KKDLIIKLPK      1210       1220       1230       1240       1250       1260YSLFELENGR KRMLASAGEL QKGNELALPS KYVNFLYLAS HYEKLKGSPE DNEQKQLFVE      1270       1280       1290       1300       1310       1320QHKHYLDEII EQISEFSKRV ILADANLDKV LSAYNKHRDK PIREQAENII HLFTLTNLGA      1330       1340       1350       1360PAAFKYFDTT IDRKRYTSTK EVLDATLIHQ SITGLYETRI DLSQLGGD
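As an informal illustration of the substitution notation used above (e.g., D10A, in which the original residue, the 1-based position, and the replacement residue are given in that order), the following Python sketch applies such a substitution to a protein sequence and checks that the expected original residue is present. The function name `apply_substitution` is hypothetical. The 20-residue fragment used in the example is the N-terminus of SEQ ID NO:33 with position 10 restored to D, as implied by the name D10A.

```python
import re


def apply_substitution(protein: str, mutation: str) -> str:
    """Apply a single substitution such as 'D10A' (original residue,
    1-based position, new residue) and return the mutated sequence."""
    m = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if m is None:
        raise ValueError(f"unrecognized mutation: {mutation!r}")
    original, position, replacement = m.group(1), int(m.group(2)), m.group(3)
    if not 1 <= position <= len(protein):
        raise ValueError(f"position {position} is outside the sequence")
    if protein[position - 1] != original:
        raise ValueError(
            f"expected {original} at position {position}, "
            f"found {protein[position - 1]}"
        )
    return protein[:position - 1] + replacement + protein[position:]


if __name__ == "__main__":
    n_terminus = "MDKKYSIGLDIGTNSVGWAV"  # residues 1-20 with D at position 10
    print(apply_substitution(n_terminus, "D10A"))
```

Applying several substitutions in turn (for example D10A followed by H840A on a full-length sequence) is safe because substitutions do not shift residue numbering.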

In some embodiments, the Cas9 nuclease used herein is at least about 50% identical to the sequence of S. pyogenes Cas9, i.e., at least 50% identical to SEQ ID NO:33. In some embodiments, the sequences are about 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99% or 100% identical to SEQ ID NO:33. In some embodiments, any differences from SEQ ID NO:33 are in non-conserved regions, as identified by sequence alignment of sequences set forth in Chylinski et al., RNA Biology 10:5, 1-12; 2013 (e.g., in supplementary FIG. 1 and supplementary table 1 thereof); Esvelt et al., Nat Methods. 2013 November; 10(11):1116-21 and Fonfara et al., Nucl. Acids Res. (2014) 42 (4): 2577-2590. [Epub ahead of print 2013 Nov. 22] doi:10.1093/nar/gkt1074.

To determine the percent identity of two sequences, the sequences are aligned for optimal comparison purposes (gaps are introduced in one or both of a first and a second amino acid or nucleic acid sequence as required for optimal alignment, and non-homologous sequences can be disregarded for comparison purposes). The length of a reference sequence aligned for comparison purposes is at least 50% (in some embodiments, about 50%, 55%, 60%, 65%, 70%, 75%, 85%, 90%, 95%, or 100% of the length of the reference sequence is aligned). The nucleotides or residues at corresponding positions are then compared. When a position in the first sequence is occupied by the same nucleotide or residue as the corresponding position in the second sequence, then the molecules are identical at that position. The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
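For illustration only, the calculation described above can be sketched in Python for two sequences that have already been aligned, with gaps written as `-`. The function name `percent_identity` is hypothetical. Dividing by the number of alignment columns (so that each gap position lowers the result) is one of several conventions in use and is an assumption here, not a definition taken from this description.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Identical, non-gap positions are counted as matches; the denominator is
    the number of alignment columns, so gaps reduce the percentage."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = list(zip(aligned_a.upper(), aligned_b.upper()))
    # Columns in which both sequences carry a gap hold no information.
    columns = [(x, y) for x, y in pairs if not (x == gap and y == gap)]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    print(round(percent_identity("ACGT-A", "ACCTGA"), 2))
```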

The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. For purposes of the present application, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch ((1970) J. Mol. Biol. 48:444-453) algorithm which has been incorporated into the GAP program in the GCG software package, using a BLOSUM 62 scoring matrix with a gap penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
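The following Python sketch illustrates the Needleman and Wunsch global alignment procedure in its simplest form. It is not the GAP program: for brevity it substitutes a plain match/mismatch score for the BLOSUM 62 matrix and a single linear gap penalty for the separate gap and gap extend penalties, and the function name and default scores are hypothetical.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1,
                     gap: int = -2):
    """Globally align `a` and `b`; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            score[i][j] = max(diagonal, up, left)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + (
                match if a[i - 1] == b[j - 1] else mismatch):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))
```

The two aligned strings returned have equal length and can be passed to any percent-identity calculation that accepts gapped input.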

Cas9-HFD

Cas9-HFD are described in U.S. Provisional Patent Application Ser. Nos. 61/799,647, filed on Mar. 15, 2013, and 61/838,148, filed on Jun. 21, 2013, and in PCT International Application No. PCT/US14/27335, all of which are incorporated herein by reference in their entirety.

The Cas9-HFD are created by fusing a heterologous functional domain (e.g., a transcriptional activation domain, e.g., from VP64 or NF-κB p65), to the N-terminus or C-terminus of a catalytically inactive Cas9 protein (dCas9). In the present case, as noted above, the dCas9 can be from any species but is preferably from S. pyogenes. In some embodiments, the Cas9 contains mutations in the D10 and H840 residues, e.g., D10N/D10A and H840A/H840N/H840Y, to render the nuclease portion of the protein catalytically inactive, e.g., as shown in SEQ ID NO:33 above.

The transcriptional activation domains can be fused on the N or C terminus of the Cas9. In addition, although the present description exemplifies transcriptional activation domains, other heterologous functional domains (e.g., transcriptional repressors (e.g., KRAB, ERD, SID, and others, e.g., amino acids 473-530 of the ets2 repressor factor (ERF) repressor domain (ERD), amino acids 1-97 of the KRAB domain of KOX1, or amino acids 1-36 of the Mad mSIN3 interaction domain (SID); see Beerli et al., PNAS USA 95:14628-14633 (1998)) or silencers such as Heterochromatin Protein 1 (HP1, also known as swi6), e.g., HP1α or HP1β; proteins or peptides that could recruit long non-coding RNAs (lncRNAs) fused to a fixed RNA binding sequence such as those bound by the MS2 coat protein, endoribonuclease Csy4, or the lambda N protein; enzymes that modify the methylation state of DNA (e.g., DNA methyltransferase (DNMT) or TET proteins); or enzymes that modify histone subunits (e.g., histone acetyltransferases (HAT), histone deacetylases (HDAC), histone methyltransferases (e.g., for methylation of lysine or arginine residues) or histone demethylases (e.g., for demethylation of lysine or arginine residues))) as are known in the art can also be used. A number of sequences for such domains are known in the art, e.g., a domain that catalyzes hydroxylation of methylated cytosines in DNA. Exemplary proteins include the Ten-Eleven-Translocation (TET)1-3 family, enzymes that convert 5-methylcytosine (5-mC) to 5-hydroxymethylcytosine (5-hmC) in DNA.

Sequences for human TET1-3 are known in the art and are shown in the following table:

GenBank Accession Nos.

Gene     Amino Acid                 Nucleic Acid
TET1     NP_085128.2                NM_030625.2
TET2*    NP_001120680.1 (var 1)     NM_001127208.2
         NP_060098.3 (var 2)        NM_017628.4
TET3     NP_659430.1                NM_144993.1

*Variant (1) represents the longer transcript and encodes the longer isoform (a). Variant (2) differs in the 5′ UTR and in the 3′ UTR and coding sequence compared to variant 1. The resulting isoform (b) is shorter and has a distinct C-terminus compared to isoform a.

In some embodiments, all or part of the full-length sequence of the catalytic domain can be included, e.g., a catalytic module comprising the cysteine-rich extension and the 2OGFeDO domain encoded by 7 highly conserved exons, e.g., the Tet1 catalytic domain comprising amino acids 1580-2052, Tet2 comprising amino acids 1290-1905 and Tet3 comprising amino acids 966-1678. See, e.g., FIG. 1 of Iyer et al., Cell Cycle. 2009 Jun. 1; 8(11):1698-710. Epub 2009 Jun. 27, for an alignment illustrating the key catalytic residues in all three Tet proteins, and the supplementary materials thereof (available at ftp site ftp.ncbi.nih.gov/pub/aravind/DONS/supplementary_material_DONS.html) for full length sequences (see, e.g., seq 2c); in some embodiments, the sequence includes amino acids 1418-2136 of Tet1 or the corresponding region in Tet2/3.

Other catalytic modules can be from the proteins identified in Iyer et al., 2009.

In some embodiments, the heterologous functional domain is a biological tether, and comprises all or part of (e.g., DNA binding domain from) the MS2 coat protein, endoribonuclease Csy4, or the lambda N protein. These proteins can be used to recruit RNA molecules containing a specific stem-loop structure to a locale specified by the dCas9 gRNA targeting sequences. For example, a dCas9 fused to MS2 coat protein, endoribonuclease Csy4, or lambda N can be used to recruit a long non-coding RNA (lncRNA) such as XIST or HOTAIR; see, e.g., Keryer-Bibens et al., Biol. Cell 100:125-138 (2008), that is linked to the Csy4, MS2 or lambda N binding sequence. Alternatively, the Csy4, MS2 or lambda N protein binding sequence can be linked to another protein, e.g., as described in Keryer-Bibens et al., supra, and the protein can be targeted to the dCas9 binding site using the methods and compositions described herein. In some embodiments, the Csy4 is catalytically inactive.

In some embodiments, the fusion proteins include a linker between the dCas9 and the heterologous functional domains. Linkers that can be used in these fusion proteins (or between fusion proteins in a concatenated structure) can include any sequence that does not interfere with the function of the fusion proteins. In preferred embodiments, the linkers are short, e.g., 2-20 amino acids, and are typically flexible (i.e., comprising amino acids with a high degree of freedom such as glycine, alanine, and serine). In some embodiments, the linker comprises one or more units consisting of GGGS (SEQ ID NO:34) or GGGGS (SEQ ID NO:35), e.g., two, three, four, or more repeats of the GGGS (SEQ ID NO:34) or GGGGS (SEQ ID NO:35) unit. Other linker sequences can also be used.
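Purely as an illustration of the repeated-unit linkers just described, the following Python sketch builds a linker from GGGS or GGGGS units and joins two protein sequences with it. The function names are hypothetical, and the placeholder sequences in the example are not real domain sequences.

```python
ALLOWED_UNITS = ("GGGS", "GGGGS")  # SEQ ID NO:34 and SEQ ID NO:35


def build_linker(unit: str = "GGGGS", repeats: int = 3) -> str:
    """Return a flexible linker made of `repeats` copies of `unit`."""
    if unit not in ALLOWED_UNITS:
        raise ValueError(f"unit must be one of {ALLOWED_UNITS}")
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return unit * repeats


def fuse(n_terminal: str, c_terminal: str, linker: str) -> str:
    """Join two protein sequences with a linker between them."""
    return n_terminal + linker + c_terminal


if __name__ == "__main__":
    linker = build_linker("GGGS", repeats=2)
    # "MAAA" and "KKKM" stand in for the dCas9 and functional-domain sequences.
    print(fuse("MAAA", "KKKM", linker))
```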

Expression Systems

In order to use the guide RNAs described, it may be desirable to express them from a nucleic acid that encodes them. This can be performed in a variety of ways. For example, the nucleic acid encoding the guide RNA can be cloned into an intermediate vector for transformation into prokaryotic or eukaryotic cells for replication and/or expression. Intermediate vectors are typically prokaryote vectors, e.g., plasmids, or shuttle vectors, or insect vectors, for storage or manipulation of the nucleic acid encoding the guide RNA for production of the guide RNA. The nucleic acid encoding the guide RNA can also be cloned into an expression vector, for administration to a plant cell, animal cell, preferably a mammalian cell or a human cell, fungal cell, bacterial cell, or protozoan cell.

To obtain expression, a sequence encoding a guide RNA is typically subcloned into an expression vector that contains a promoter to direct transcription. Suitable bacterial and eukaryotic promoters are well known in the art and described, e.g., in Sambrook et al., Molecular Cloning, A Laboratory Manual (3d ed. 2001); Kriegler, Gene Transfer and Expression: A Laboratory Manual (1990); and Current Protocols in Molecular Biology (Ausubel et al., eds., 2010). Bacterial expression systems for expressing the engineered protein are available in, e.g., E. coli, Bacillus sp., and Salmonella (Palva et al., 1983, Gene 22:229-235). Kits for such expression systems are commercially available. Eukaryotic expression systems for mammalian cells, yeast, and insect cells are well known in the art and are also commercially available.

The promoter used to direct expression of a nucleic acid depends on the particular application. For example, a strong constitutive promoter is typically used for expression and purification of fusion proteins. In contrast, when the guide RNA is to be administered in vivo for gene regulation, either a constitutive or an inducible promoter can be used, depending on the particular use of the guide RNA. In addition, a preferred promoter for administration of the guide RNA can be a weak promoter, such as HSV TK or a promoter having similar activity. The promoter can also include elements that are responsive to transactivation, e.g., hypoxia response elements, Gal4 response elements, lac repressor response element, and small molecule control systems such as tetracycline-regulated systems and the RU-486 system (see, e.g., Gossen & Bujard, 1992, Proc. Natl. Acad. Sci. USA, 89:5547; Oligino et al., 1998, Gene Ther., 5:491-496; Wang et al., 1997, Gene Ther., 4:432-441; Neering et al., 1996, Blood, 88:1147-55; and Rendahl et al., 1998, Nat. Biotechnol., 16:757-761).

In addition to the promoter, the expression vector typically contains a transcription unit or expression cassette that contains all the additional elements required for the expression of the nucleic acid in host cells, either prokaryotic or eukaryotic. A typical expression cassette thus contains a promoter operably linked, e.g., to the nucleic acid sequence encoding the gRNA, and any signals required, e.g., for efficient polyadenylation of the transcript, transcriptional termination, ribosome binding sites, or translation termination. Additional elements of the cassette may include, e.g., enhancers, and heterologous spliced intronic signals.

The particular expression vector used to transport the genetic information into the cell is selected with regard to the intended use of the gRNA, e.g., expression in plants, animals, bacteria, fungus, protozoa, etc. Standard bacterial expression vectors include plasmids such as pBR322 based plasmids, pSKF, pET23D, and commercially available tag-fusion expression systems such as GST and LacZ.

Expression vectors containing regulatory elements from eukaryotic viruses are often used in eukaryotic expression vectors, e.g., SV40 vectors, papilloma virus vectors, and vectors derived from Epstein-Barr virus. Other exemplary eukaryotic vectors include pMSG, pAV009/A+, pMT010/A+, pMAMneo-5, baculovirus pDSVE, and any other vector allowing expression of proteins under the direction of the SV40 early promoter, SV40 late promoter, metallothionein promoter, murine mammary tumor virus promoter, Rous sarcoma virus promoter, polyhedrin promoter, or other promoters shown effective for expression in eukaryotic cells.

The vectors for expressing the guide RNAs can include RNA Pol III promoters to drive expression of the guide RNAs, e.g., the H1, U6 or 7SK promoters. These human promoters allow for expression of gRNAs in mammalian cells following plasmid transfection. Alternatively, a T7 promoter may be used, e.g., for in vitro transcription, and the RNA can be transcribed in vitro and purified. Vectors suitable for the expression of short RNAs, e.g., siRNAs, shRNAs, or other small RNAs, can be used.

Some expression systems have markers for selection of stably transfected cell lines such as thymidine kinase, hygromycin B phosphotransferase, and dihydrofolate reductase. High yield expression systems are also suitable, such as using a baculovirus vector in insect cells, with the gRNA encoding sequence under the direction of the polyhedrin promoter or other strong baculovirus promoters.

The elements that are typically included in expression vectors also include a replicon that functions in E. coli, a gene encoding antibiotic resistance to permit selection of bacteria that harbor recombinant plasmids, and unique restriction sites in nonessential regions of the plasmid to allow insertion of recombinant sequences.

Standard transfection methods are used to produce bacterial, mammalian, yeast or insect cell lines that express large quantities of protein, which are then purified using standard techniques (see, e.g., Colley et al., 1989, J. Biol. Chem., 264:17619-22; Guide to Protein Purification, in Methods in Enzymology, vol. 182 (Deutscher, ed., 1990)). Transformation of eukaryotic and prokaryotic cells is performed according to standard techniques (see, e.g., Morrison, 1977, J. Bacteriol. 132:349-351; Clark-Curtiss & Curtiss, Methods in Enzymology 101:347-362 (Wu et al., eds, 1983)).

Any of the known procedures for introducing foreign nucleotide sequences into host cells may be used. These include the use of calcium phosphate transfection, polybrene, protoplast fusion, electroporation, nucleofection, liposomes, microinjection, naked DNA, plasmid vectors, viral vectors, both episomal and integrative, and any of the other well-known methods for introducing cloned genomic DNA, cDNA, synthetic DNA or other foreign genetic material into a host cell (see, e.g., Sambrook et al., supra). It is only necessary that the particular genetic engineering procedure used be capable of successfully introducing at least one gene into the host cell capable of expressing the gRNA.

The present invention includes the vectors and cells comprising the vectors.

EXAMPLES

The invention is further described in the following examples, which do not limit the scope of the invention described in the claims.

Example 1. Assessing Specificity of RNA-Guided Endonucleases

CRISPR RNA-guided nucleases (RGNs) have rapidly emerged as a facile and efficient platform for genome editing. This example describes the use of a human cell-based reporter assay to characterize off-target cleavage of Cas9-based RGNs.

Materials and Methods

The following materials and methods were used in Example 1.

Construction of Guide RNAs

DNA oligonucleotides (Table A) harboring variable 20 nt sequences for Cas9 targeting were annealed to generate short double-strand DNA fragments with 4 bp overhangs compatible with ligation into BsmBI-digested plasmid pMLM3636. Cloning of these annealed oligonucleotides generates plasmids encoding a chimeric +103 single-chain guide RNA with 20 variable 5′ nucleotides under the control of a U6 promoter (Hwang et al., Nat Biotechnol 31, 227-229 (2013); Mali et al., Science 339, 823-826 (2013)). pMLM3636 and the expression plasmid pJDS246 (encoding a codon optimized version of Cas9) used in this study are both available through the non-profit plasmid distribution service Addgene (addgene.org/crispr-cas).
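As an illustrative sketch only, the oligonucleotide pair for a given 20 nt target can be generated in Python as shown below. The pattern used (oligonucleotide 1 = ACACC + target + G; oligonucleotide 2 = AAAAC + reverse complement of the target + G) is inferred from the first rows of Table A rather than stated in the text, and the function names are hypothetical. Annealing such a pair leaves 4 nt 5′ overhangs (ACAC and AAAA) flanking a 22 bp duplex.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_oligos(target: str):
    """Return (oligo1, oligo2), both written 5' to 3', for a 20-nt target.

    The flanking bases follow the pattern inferred from Table A; they are
    an assumption, not a rule stated in the description."""
    target = target.upper()
    if len(target) != 20 or set(target) - set("ACGT"):
        raise ValueError("target must be 20 nt of A, C, G, or T")
    oligo1 = "ACACC" + target + "G"
    oligo2 = "AAAAC" + reverse_complement(target) + "G"
    return oligo1, oligo2


if __name__ == "__main__":
    # EGFP Target Site 1 from Table A.
    for oligo in design_oligos("GGGCACGGGCAGCTTGCCGG"):
        print(oligo)
```

For EGFP Target Site 1, the two strings produced agree with the first pair of oligonucleotides listed in Table A.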

TABLE A gRNA Target Sequence PositionOligos for genterating gRNA expression plasmid EGFP Target Site 1 20 1918 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1oligonucleotide 1 (5′ to 3′) # oligonucleotide 2 (5′ to 3′) # G G G C AC G G G C A G C T T G C C G G ACACCGGGCACGGGCAGCTTGCCGGG  36.AAAACCCGGCAAGCTGCCCGTGCCCG 230. G G G C A C G G G C A G C T T G C C G cACACCGGGCACGGGCAGCTTGCCGCG  37. AAAACGCGGCAAGCTGCCCGTGCCCG 231. G G G CA C G G G C A G C T T G C C c G ACACCGGGCACGGGCAGCTTGCCCGG  38.AAAACCGGGCAAGCTGCCCGTGCCCG 232. G G G C A C G G G C A G C T T G C g G GACACCGGGCACGGGCAGCTTGCGGGG  39. AAAACCCCGCAAGCTGCCCGTGCCCG 233. G G G CA C G G G C A G C T T G g C G G ACACCGGGCACGGGCAGCTTGGCGGG  40.AAAACCCGCCAAGCTGCCCGTGCCCG 234. G G G C A C G G G C A G C T T c C C G GACACCGGGCACGGGCAGCTTCCCGGG  41. AAAACCCGGGAAGCTGCCCGTGCCCG 235. G G G CA C G G G C A G C T a G C C G G ACACCGGGCACGGGCAGCTAGCCGGG  42.AAAACCCGGCTAGCTGCCCGTGCCCG 236. G G G C A C G G G C A G C a T G C C G GACACCGGGCACGGGCAGCATGCCGGG  43. AAAACCCGGCATGCTGCCCGTGCCCG 237. G G G CA C G G G C A G g T T G C C G G ACACCGGGCACGGGCAGGTTGCCGGG  44.AAAACCCGGCAACCTGCCCGTGCCCG 238. G G G C A C G G G C A c C T T G C C G GACACCGGGCACGGGCACCTTGCCGGG  45. AAAACCCGGCAAGGTGCCCGTGCCCG 239. G G G CA C G G G C t G C T T G C C G G ACACCGGGCACGGGCTGCTTGCCGGG  46.AAAACCCGGCAAGCAGCCCGTGCCCG 240. G G G C A C G G G g A G C T T G C C G GACACCGGGCACGGGGAGCTTGCCGGG  47. AAAACCCGGCAAGCTCCCCGTGCCCG 241. G G G CA C G G c C A G C T T G C C G G ACACCGGGCACGGCCAGCTTGCCGGG  48.AAAACCCGGCAAGCTGGCCGTGCCCG 242. G G G C A C G c G C A G C T T G C C G GACACCGGGCACGCGCAGCTTGCCGGG  49. AAAACCCGGCAAGCTGCGCGTGCCCG 243. G G G CA C c G G C A G C T T G C C G G ACACCGGGCACCGGCAGCTTGCCGGG  50.AAAACCCGGCAAGCTGCCGGTGCCCG 244. G G G C A g G G G C A G C T T G C C G GACACCGGGCAGGGGCAGCTTGCCGGG  51. AAAACCCGGCAAGCTGCCCCTGCCCG 245. G G G Ct C G G G C A G C T T G C C G G ACACCGGGCTCGGGCAGCTTGCCGGG  52.AAAACCCGGCAAGCTGCCCGAGCCCG 246. 
G G G g A C G G G C A G C T T G C C G GACACCGGGGACGGGCAGCTTGCCGGG  53. AAAACCCGGCAAGCTGCCCGTCCCCG 247. G G c CA C G G G C A G C T T G C C G G ACACCGGCCACGGGCAGCTTGCCGGG  54.AAAACCCGGCAAGCTGCCCGTGGCCG 248. G c G C A C G G G C A G C T T G C C G GACACCGCGCACGGGCAGCTTGCCGGG  55. AAAACCCGGCAAGCTGCCCGTGCGCG 249. G G G CA C G G G C A G C T T G C C G G ACACCGGGCACGGGCAGCTTGCCCCG  56.AAAACGGGGCAAGCTGCCCGTGCCCG 250. G G G C A C G G G C A G C T T G g g G GACACCGGGCACGGGCAGCTTGGGGGG  57. AAAACCCCCCAAGCTGCCCGTGCCCG 251. G G G CA C G G G C A G C T a c C C G G ACACCGGGCACGGGCAGCTACCCGGG  58.AAAACCCGGGTAGCTGCCCGTGCCCG 252. G G G C A C G G G C A G g a T G C C G GACACCGGGCACGGGCAGGATGCCGGG  59. AAAACCCGGCATCCTGCCCGTGCCCG 253. G G G CA C G G G C t c C T T G C C G G ACACCGGGCACGGGCTCCTTGCCGGG  60.AAAACCCGGCAAGGAGCCCGTGCCCG 254. G G G C A C G G c g A G C T T G C C G GACACCGGGCACGGCGAGCTTGCCGGG  61. AAAACCCGGCAAGCTCGCCGTGCCCG 255. G G G CA C c c G C A G C T T G C C G G ACACCGGGCACCCGCAGCTTGCCGGG  62.AAAACCCGGCAAGCTGCGGGTGCCCG 256. G G G C t g G G G C A G C T T G C C G GACACCGGGCTGGGGCAGCTTGCCGGG  63. AAAACCCGGCAAGCTGCCCCAGCCCG 257. G G c gA C G G G C A G C T T G C C G G ACACCGGCGACGGGCAGCTTGCCGGG  64.AAAACCCGGCAAGCTGCCCGTCGCCG 258. G c c C A C G G G C A G C T T G C C G GACACCGCCCACGGGCAGCTTGCCGGG  65. AAAACCCGGCAAGCTGCCCGTGCGGG 259. G c C gA C G G G C A G C T T G C C G G ACACCGCCGACGGGCAGCTTGCCGGG  66.AAAACCCGGCAAGCTGCCCGTGCCCG 260. G c c g t C G G G C A G C T T G C C G GACACCGCCGTCGGGCAGCTTGCCGGG  67. AAAACCCGGCAAGCTGCCCGTGCCCG 261. G c c gt g G G G C A G C T T G C C G G ACACCGCCGTGGGGCAGCTTGCCGGG  68.AAAACCCGGCAAGCTGCCCGTGCCCG 262. G c c g t g c G G C A G C T T G C C G GACACCGCCGTGCGGCAGCTTGCCGGG  69. AAAACCCGGCAAGCTGCCCGTGCCCG 263. G c c gt g c c G C A G C T T G C C G G ACACCGCCGTGCCGCAGCTTGCCGGG  70.AAAACCCGGCAAGCTGCCCGTGCCCG 264. G c c g t g c c c C A G C T T G C C G GACACCGCCGTGCCCCAGCTTGCCGGG  71. AAAACCCGGCAAGCTGCCCGTGCCCG 265. 
G c c gt g c c c g A G C T T G C C G G ACACCGCCGTGCCCGAGCTTGCCGGG  72.AAAACCCGGCAAGCTGCCCGTGCCCG 266. G G G C A C G G G C A G C T T G C g G cACACCGGGCACGGGCAGCTTGCGGCG  73. AAAACGCCGCAAGCTGCCCGTGCCCG 267. G G G CA C G G G C A G C T T c C g G G ACACCGGGCACGGGCAGCTTCCGGGG  74.AAAACCCCGGAAGCTGCCCGTGCCCG 268. G G G C A C G G G C a t g c g G C C G GACACCGGGCACGGGCAGCATGCGGGG  75. AAAACCCCGCATGCTGCCCGTGCCCG 269. G G G CA C G G G C A G C T T G C C G G ACACCGGGCACGGGCACCTTGCGGGG  76.AAAACCCCGCAAGGTGCCCGTGCCCG 270. G G G C A C G G G g A G C T T G C g G GACACCGGGCACGGGGAGCTTGCGGGG  77. AAAACCCCGCAAGCTCCCCGTGCCCG 271. G G G CA C G c G C A G C T T G C g G G ACACCGGGCACGCGCAGCTTGCGGGG  78.AAAACCCCGCAAGCTGCGCGTGCCCG 272. G G G C A g G G G C A G C T T G C g G GACACCGGGCAGGGGCAGCTTGCGGGG  79. AAAACCCCGCAAGCTGCCCCTGCCCG 273. G G G gA C G G G C A G C T T G C C G G ACACCGGGGACGGGCAGCTTGCGGGG  80.AAAACCCCGCAAGCTGCCCGTCCCCG 274. G c G C A C G G G C A G C T T G C g G GACACCGCGCACGGGCAGCTTGCGGGG  81. AAAACCCCGCAAGCTGCCCGTGCGCG 275. G G G CA C G G G g A G C T T G C C G c ACACCGGGCACGGGGAGCTTGCCGCG  82.AAAACGCGGCAAGCTCCCCGTGCCCG 276. G G G C A C G G G g A G C T T c C C G GACACCGGGCACGGGGAGCTTCCCGGG  83. AAAACCCGGGAAGCTCCCCGTGCCCG 277. G G G CA C G G G g A G C a T G C C G G ACACCGGGCACGGGGAGCATGCCGGG  84.AAAACCCGGCATGCTCCCCGTGCCCG 278. G G G C A C G G G g A c C T T G C C G GACACCGGGCACGGGGACCTTGCCGGG  85. AAAACCCGGCAAGGTCCCCGTGCCCG 279. G G G CA C G c G g A G C T T G C C G G ACACCGGGCACGCGGAGCTTGCCGGG  86.AAAACCCGGCAAGCTCCGCGTGCCCG 280. G G G C A g G G G g A G C T T G C C G GACACCGGGCAGGGGGAGCTTGCCGGG  87. AAAACCCGGCAAGCTCCCCCTGCCCG 281. G G G gA C G G G g A G C T T G C C G G ACACCGGGGACGGGGAGCTTGCCGGG  88.AAAACCCGGCAAGCTCCCCGTCCCCG 282. G c G C A C G G G g A G C T T G C C G GACACCGCGCACGGGGAGCTTGCCGGG  89. AAAACCCGGCAAGCTCCCCGTGCGCG 283. G c G CA C G G G C A G C T T G C C G c ACACCGCGCACGGGCAGCTTGCCGCG  90.AAAACGCGGCAAGCTGCCCGTGCGCG 284. 
G c G C A C G G G C A G C T T c C C G GACACCGCGCAGGGGGAGGTTCCCGGG  91. AAAACCCGGGAAGCTGCCCGTGCGCG 285. G c G CA C G G G C A G C a T G C C G G ACACCGCGCAGGGGGAGGATGCCGGG  92.AAAACCCGGCATGCTGCCCGTGCGCG 286. G c G C A C G G G C A c C T T G C C G GACACCGCGCACGGGCACCTTGCCGGG  93. AAAACCCGGCAAGGTGCCCGTGCGCG 287. G c G CA C G c G C A G C T T G C C G G ACACCGCGCACGCGCAGCTTGCCGGG  94.AAAACCCGGCAAGCTGCGCGTGCGCG 288. G c G C A g G G G C A G C T T G C C G GACACCGCGCAGGGGGAGGTTGCCGGG  95. AAAACCCGGCAAGCTGCCCCTGCGCG 289. G c G gA C G G G C A G C T T G C C G G ACACCGCGGACGGGCAGCTTGCCGGG  96.AAAACCCGGCAAGCTGCCCGTCCGCG 290. EGFP TARGET SITE 2 20 19 18 17 16 15 1413 12 11 10 9 8 7 6 5 4 3 2 1 oligonucleotide 1 (5′ to 3′) #oligonucleotide 2 (5′ to 3′) # G A T G C C G T T C T T C T G C T T G TACACCGATGCCGTTGTTGTGGTTGTG  97. AAAACACAAGCAGAAGAACGGCATCG 291. G A T GC C G T T C T T C T G C T T G a ACACCGATGCCGTTCTTCTGCTTGAG  98.AAAACACAAGCAGAAGAACGGCATCG 292. G A T G C C G T T C T T C T G C T T c TACACCGATGCCGTTCTTCTGCTTCTG  99. AAAACACAAGCAGAAGAACGGCATCG 293. G A T GC C G T T C T T C T G C T a G T ACACCGATGCCGTTGTTGTGGTAGTG 100AAAACACAAGCAGAAGAACGGCATCG 294. G A T G C C G T T C T T C T G C a T G TACACCGATGCCGTTCTTCTGCATGTG 101 AAAACACAAGCAGAAGAACGGCATCG 295. G A T G CC G T T C T T C T G g T T G T ACACCGATGCCGTTCTTCTGGTTGTG 102AAAACACAAGCAGAAGAACGGCATCG 296. G A T G C C G T T C T T C T c C T T G TACACCGATGCCGTTCTTCTCCTTGTG 103 AAAACACAAGCAGAAGAACGGCATCG 297. G A T G CC G T T C T T C a G C T T G T ACACCGATGCCGTTGTTGAGGTTGTG 104AAAACACAAGCAGAAGAACGGCATCG 298. G A T G C C G T T C T T g T G C T T G TACACCGATGCCGTTCTTGTGCTTGTG 105 AAAACACAAGCAGAAGAACGGCATCG 299. G A T G CC G T T C T a C T G C T T G T ACACCGATGCCGTTGTAGTGGTTGTG 106AAAACACAAGCAGAAGAACGGCATCG 300. G A T G C C G T T C a T C T G C T T G TACACCGATGCCGTTCATCTGCTTGTG 107 AAAACACAAGCAGAAGAACGGCATCG 301. G A T G CC G T T g T T C T G C T T G T ACACCGATGCCGTTGTTCTGCTTGTG 108AAAACACAAGCAGAAGAACGGCATCG 302. 
G A T G C C G T a C T T C T G C T T G TACACCGATGCCGTACTTCTGCTTGTG 109 AAAACACAAGCAGAAGAACGGCATCG 303. G A T G CC G a T C T T C T G C T T G T ACACCGATGCCGATCTTCTGCTTGTG 110AAAACACAAGCAGAAGAACGGCATCG 304. G A T G C C c T T C T T C T G C T T G TACACCGATGCCCTTCTTCTGCTTGTG 111 AAAACACAAGCAGAAGAACGGCATCG 305. G A T G Cg G T T C T T C T G C T T G T ACACCGATGCGGTTCTTCTGCTTGTG 112AAAACACAAGCAGAAGAACGGCATCG 306. G A T G g C G T T C T T C T G C T T G TACACCGATGGCGTTCTTCTGCTTGTG 113 AAAACACAAGCAGAAGAACGGCATCG 307. G A T c CC G T T C T T C T G C T T G T ACACCGATCCCGTTCTTGTGGTTGTG 114AAAACACAAGCAGAAGAACGGCATCG 308. G A t G C C G T T C T T C T G C T T G TACACCGAAGCCGTTCTTGTGGTTGTG 115 AAAACACAAGCAGAAGAACGGCATCG 309. G a T G CC G T T C T T C T G C T T G T ACACCGTTGCCGTTCTTGTGGTTGTG 116AAAACACAAGCAGAAGAACGGCATCG 310. G A T G C C G T T C T T C T G C T T c aACACCGATGCCGTTCTTCTGCTTCAG 117 AAAACTGAAGCAGAAGAACGGCATCG 311. G A T G CC G T T C T T C T G C a a G T ACACCGATGCCGTTCTTGTGGAAGTG 118AAAACACAAGCAGAAGAACGGCATCG 312. G A T G C C G T T C T T C T c g T T G TACACCGATGCCGTTCTTCTCGTTGTG 119 AAAACACAAGCAGAAGAACGGCATCG 313. G A T G CC G T T C T T g a G C T T G T ACACCGATGCCGTTCTTGAGGTTGTG 120AAAACACAAGCTCAAGAACGGCATCG 314. G A T G C C G T T C a a C T G C T T G TACACCGATGCCGTTCAAGTGGTTGTG 121 AAAACACAAGCAGAAGAACGGCATCG 315. G A T G CC G T a g T T C T G C T T G T ACACCGATGCCGTACTTGTGGTTGTG 122AAAACACAAGCAGAAGAACGGCATCG 316. G A T G C C c a T C T T C T G C T T G TACACCGATGCCCATCTTGTGGTTGTG 123 AAAACACAAGCAGAAGAACGGCATCG 317. G A T G gg G T T C T T C T G C T T G T ACACCGATGGGGTTCTTCTGCTTGTG 124AAAACACAAGCAGAAGAACGGCATCG 318. G A a c C C G T T C T T C T G C T T G TACACCGAACCCGTTCTTCTGCTTGTG 125 AAAACACAAGCAGAAGAACGGCATCG 319. G t a G CC G T T C T T C T G C T T G T ACACCGTAGCCCTTGTTGTGGTTGTG 126AAAACACAAGCAGAAGAACGGCAAGG 320. G t a c C C G T T C T T C T G C T T G TACACCGTACCCCTTGTTGTGGTTGTG 127 AAAACACAAGCAGAAGAACGGGTACG 321. 
G t a c gC G T T C T T C T G C T T G T ACACCGTAGGCCTTGTTGTGGTTGTG 128AAAACACAAGCAGAAGAACGCGTACG 322. G t a c g g G T T C T T C T G C T T G TACACCGTACGGGTTGTTGTGGTTGTG 129 AAAACACAAGCAGAAGAACCCGTACG 323. G t a c gg c T T C T T C T G C T T G T ACACCGTACGGCTTCTTCTGCTTGTG 130AAAACACAAGCAGAAGAAGCCGTACG 324. G t a c g g c a T C T T C T G C T T G TACACCGTACGGCATCTTCTGCTTGTG 131 AAAACACAAGCAGAAGATGCCGTACG 325. G t a c gg c a a C T T C T G C T T G T ACACCGTACGGCAACTTCTGCTTGTG 132AAAACACAAGCAGAAGTTGCCGTACG 326. G t a c g g c a a g T T C T G C T T G TACACCGTACGGCAAGTTCTGCTTGTG 133 AAAACACAAGCAGAACTTGCCGTACG 327. G A T G CC G T T C T T C T G C T a G a ACACCGATGCCGTTCTTCTGCTAGAG 134AAAACTCTAGCAGAAGAACGGCATCG 328. G A T G C C G T T C T T C T G g T a G TACACCGATGCCGTTCTTCTGGTAGTG 135 AAAACACTACCAGAAGAACGGCATCG 329. G A T G CC G T T C T T C a G C T a G T ACACCGATGCCGTTCTTCAGCTAGTG 136AAAACACTAGCTGAAGAACGGCATCG 330. G A T G C C G T T C T a C T G C T a G TACACCGATGCCGTTCTACTGCTAGTG 137 AAAACACTAGCAGTAGAACGGCATCG 331. G A T G CC G T T g T T C T G C T a G T ACACCGATGCCGTTGTTCTGCTAGTG 138AAAACACTAGCAGAACAACGGCATCG 332. G A T G C C G a T C T T C T G C T a G TACACCGATGCCGATCTTCTGCTAGTG 139 AAAACACTAGCAGAAGATCGGCATCG 333. G A T G Cg G T T C T T C T G C T a G T ACACCGATGCGGTTCTTCTGCTAGTG 140AAAACACTAGCAGAAGAACCGCATCG 334. G A T c C C G T T C T T C T G C T a G TACACCGATCCCGTTCTTCTGCTAGTG 141 AAAACACTAGCAGAAGAACGGGATCG 335. G t T G CC G T T C T T C T G C T a G T ACACCGTTGCCGTTCTTCTGCTAGTG 142AAAACACTAGCAGAAGAACGGCAACG 336. G A T G C C G T T g T T C T G C T T G aACACCGATGCCGTTGTTCTGCTTGAG 143 AAAACTCAAGCAGAACAACGGCATCG 337. G A T G CC G T T g T T C T G g T T G T ACACCGATGCCGTTGTTCTGGTTGTG 144AAAACACAACCAGAACAACGGCATCG 338. G A T G C C G T T g T T C a G C T T G TACACCGATGCCGTTGTTCAGCTTGTG 145 AAAACACAAGCTGAACAACGGCATCG 339. G A T G CC G T T g T a C T G C T T G T ACACCGATGCCGTTGTACTGCTTGTG 146AAAACACAAGCAGTACAACGGCATCG 340. 
G A T G C C G a T g T T C T G C T T G TACACCGATGCCGATGTTCTGCTTGTG 147 AAAACACAAGCAGAACATCGGCATCG 341. G A T G Cg G T T g T T C T G C T T G T ACACCGATGCGGTTGTTCTGCTTGTG 148AAAACACAAGCAGAACAACCGCATCG 342. G A T c C C G T T g T T C T G C T T G TACACCGATCCCGTTGTTCTGCTTGTG 149 AAAACACAAGCAGAACAACGGGATCG 343. G t T G CC G T T g T T C T G C T T G T ACACCGTTGCCGTTGTTCTGCTTGTG 150AAAACACAAGCAGAACAACGGCAACG 344. G t T G C C G T T C T T C T G C T T G aACACCGTTGCCGTTCTTCTGCTTGAG 151 AAAACTCAAGCAGAAGAACGGCAACG 345. G t T G CC G T T C T T C T G g T T G T ACACCGTTGCCGTTCTTCTGGTTGTG 152AAAACACAACCAGAAGAACGGCAACG 346. G t T G C C G T T C T T C a G C T T G TACACCGTTGCCGTTGTTCAGCTTGTG 153 AAAACACAAGCTGAAGAACGGCAACG 347. G t T G CC G T T C T a C T G C T T G T ACACCGTTGCCGTTGTACTGCTTGTG 154AAAACACAAGCAGTAGAACGGCAACG 348. G t T G C C G a T C T T C T G C T T G TACACCGTTGCCGATCTTCTGCTTGTG 155 AAAACACAAGCAGAAGATCGGCAACG 349. G t T G Cg G T T C T T C T G C T T G T ACACCGTTGCGGTTCTTCTGCTTGTG 156AAAACACAAGCAGAAGAACCGCAACG 350. G t T c C C G T T C T T C T G C T T G TACACCGTTCCCGTTCTTCTGCTTGTG 157 AAAACACAAGCAGAAGAACGGGAACG 351.EGFP TARGET SITE 3 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1oligonucleotide 1 (5′ to 3′) # oligonucleotide 2 (5′ to 3′) # G G T G GT G C A G A T G A A C T T C A ACACCGGTGGTGGAGATGAAGTTGAG 158AAAACTGAAGTTCATCTGCACCACCG 352. G G T G G T G C A G A T G A A C T T C gACACCGGTGGTGCAGATGAACTTCTG 159 AAAACAGAAGTTCATCTGCACCACCG 353. G G T G GT G C A G A T G A A C T T g A ACACCGGTGGTGCAGATGAACTTGAG 160AAAACTCAAGTTCATCTGCACCACCG 354. G G T G G T G C A G A T G A A C T a C AACACCGGTGGTGCAGATGAACTACAG 161 AAAACTGTAGTTCATCTGCACCACCG 355. G G T G GT G C A G A T G A A C a T C A ACACCGGTGGTGCAGATGAACATCAG 162AAAACTGATGTTCATCTGCACCACCG 356. G G T G G T G C A G A T G A A g T T C AACACCGGTGGTGGAGATGAAGTTGAG 163 AAAACTGAACTTCATCTGCACCACCG 357. G G T G GT G C A G A T G A t C T T C A ACACCGGTGGTGGAGATGATGTTGAG 164AAAACTGAAGATCATCTGCACCACCG 358. 
G G T G G T G C A G A T G t A C T T C AACACCGGTGGTGGAGATGTAGTTGAG 165 AAAACTGAAGTACATCTGCACCACCG 359. G G T G GT G C A G A T c A A C T T C A ACACCGGTGGTGGAGATGAAGTTGAG 166AAAACTGAAGTTGATCTGCACCACCG 360. G G T G G T G C A G A a G A A C T T C AACACCGGTGGTGGAGAAGAAGTTGAG 167 AAAACTGAAGTTCTTCTGCACCACCG 361. G G T G GT G C A G t T G A A C T T C A ACACCGGTGGTGGAGTTGAAGTTGAG 168AAAACTGAAGTTCAACTGCACCACCG 362. G G T G G T G C A c A T G A A C T T C AACACCGGTGGTGGAGATGAAGTTGAG 169 AAAACTGAAGTTCATGTGCACCACCG 363. G G T G GT G C t G A T G A A C T T C A ACACCGGTGGTGGTGATGAAGTTGAG 170AAAACTGAAGTTCATCAGCACCACCG 364. G G T G G T G g A G A T G A A C T T C AACACCGGTGGTGGAGATGAAGTTGAG 171 AAAACTGAAGTTCATCTCCACCACCG 365. G G T G GT c C A G A T G A A C T T C A ACACCGGTGGTCCAGATGAACTTCAG 172AAAACTGAAGTTCATCTGGACCACCG 366. G G T G G a G C A G A T G A A C T T C AACACCGGTGGAGGAGATGAAGTTGAG 173 AAAACTGAAGTTCATCTGCTCCACCG 367. G G T G cT G C A G A T G A A C T T C A ACACCGGTGGTGGAGATGAAGTTGAG 174AAAACTGAAGTTCATCTGCAGCACCG 368. G G T c G T G C A G A T G A A C T T C AACACCGGTGGTGGAGATGAAGTTGAG 175 AAAACTGAAGTTCATCTGCACGACCG 369. G G a G GT G C A G A T G A A C T T C A ACACCGGAGGTGGAGATGAAGTTGAG 176AAAACTGAAGTTCATCTGCACCTCCG 370. G c T G G T G C A G A T G A A C T T C tACACCGCTGGTGGAGATGAAGTTGAG 177 AAAACTGAAGTTCATCTGCACCAGCG 371. G G T G GT G C A G A T G A A C T T g A ACACCGGTGGTGCAGATGAACTTGTG 178AAAACACAAGTTCATCTGCACCACCG 372. G G T G G T G C A G A T G A A C a a C AACACCGGTGGTGGAGATGAAGAAGAG 179 AAAACTGTTGTTCATCTGCACCACCG 373. G G T G GT G C A G A T G A t g T T C A ACACCGGTGGTGGAGATGATGTTGAG 180AAAACTGAACATCATCTGCACCACCG 374. G G T G G T G C A G A T c t A C T T C AACACCGGTGGTGCAGATCTACTTCAG 181 AAAACTGAAGTAGATCTGCACCACCG 375. G G T G GT G C A G t a G A A C T T C A ACACCGGTGGTGGAGTAGAAGTTGAG 182AAAACTGAAGTTCTACTGCACCACCG 376. G G T G G T G C t c A T G A A C T T C AACACCGGTGGTGGTGATGAAGTTGAG 183 AAAACTGAAGTTCATGAGCACCACCG 377. 
G G T G GT c g A G A T G A A C T T C A ACACCGGTGGTGGAGATGAAGTTGAG 184AAAACTGAAGTTCATCTCGACCACCG 378. G G T G c a G C A G A T G A A C T T C AACACCGGTGGAGGAGATGAAGTTGAG 185 AAAACTGAAGTTCATCTGCTGCACCG 379. G G a c GT G C A G A T G A A C T T C A ACACCGGAGGTGGAGATGAAGTTGAG 186AAAACTGAAGTTCATCTGCACGTCCG 380. G c a G G T G C A G A T G A A C T T C AACACCGCAGGTGGAGATGAAGTTGAG 187 AAAACTGAAGTTCATCTGCACCAGGG 381. G c a c GT G C A G A T G A A C T T C A ACACCGCACGTGGAGATGAAGTTGAG 188AAAACTGAAGTTCATCTGCACGTGCG 382. G c a c G T G C A G A T G A A C T T C AACACCGCACCTGGAGATGAAGTTGAG 189 AAAACTGAAGTTCATCTGCAGGTGCG 383. G c a c GT G C A G A T G A A C T T C A ACACCGCACCAGGAGATGAAGTTGAG 190AAAACTGAAGTTCATCTGCTGGTGCG 384. G c a c G T G C A G A T G A A C T T C AACACCGCACCACCAGATGAAGTTGAG 191 AAAACTGAAGTTCATCTGGTGGTGCG 385. G c a c GT G C A G A T G A A C T T C A ACACCGCACCAGGAGATGAAGTTGAG 192AAAACTGAAGTTCATCTCGTGGTGCG 386. G c a c G T G C A G A T G A A C T T C AACACCGCACCACGTGATGAAGTTGAG 193 AAAACTGAAGTTCATCACGTGGTGCG 387. G c a c GT G C A G A T G A A C T T C A ACACCGCACCACGTGATGAAGTTGAG 194AAAACTGAAGTTCATGACGTGGTGCG 388. G G T G G T G C A G A T G A A C T a C tACACCGGTGGTGCAGATGAACTACTG 195 AAAACAGTAGTTCATCTGCACCACCG 389. G G T G GT G C A G A T G A A g T a C A ACACCGGTGGTGCAGATGAAGTACAG 196AAAACTGTACTTCATCTGCACCACCG 390. G G T G G T G C A G A T G t A C T a C AACACCGGTGGTGCAGATGTACTACAG 197 AAAACTGTAGTACATCTGCACCACCG 391. G G T G GT G C A G A a G A A C T a C A ACACCGGTGGTGCAGAAGAACTACAG 198AAAACTGTAGTTCTTCTGCACCACCG 392. G G T G G T G C A c A T G A A C T a C AACACCGGTGGTGCACATGAACTACAG 199 AAAACTGTAGTTCATGTGCACCACCG 393. G G T G GT G g A G A T G A A C T a C A ACACCGGTGGTGGAGATGAACTACAG 200AAAACTGTAGTTCATCTCCACCACCG 394. G G T G G a G C A G A T G A A C T a C AACACCGGTGGAGCAGATGAACTACAG 201 AAAACTGTAGTTCATCTGCTCCACCG 395. G G T c GT G C A G A T G A A C T a C A ACACCGGTCGTGCAGATGAACTACAG 202AAAACTGTAGTTCATCTGCACGACCG 396. 
G c T G G T G C A G A T G A A C T a C AACACCGCTGGTGCAGATGAACTACAG 203 AAAACTGTAGTTCATCTGCACCAGCG 397. G G T G GT G C A c A T G A A C T T C t ACACCGGTGGTGCACATGAACTTCTG 204AAAACAGAAGTTCATGTGCACCACCG 398. G G T G G T G C A c A T G A A g T T C AACACCGGTGGTGCACATGAAGTTCAG 205 AAAACTGAACTTCATGTGCACCACCG 399. G G T G GT G C A c A T G t A C T T C A AGAGGGGTGGTGGAGATGTAGTTGAG 206AAAACTGAAGTACATGTGCACCACCG 400. G G T G G T G C A c A a G A A C T T C AACACCGGTGGTGCACAAGAACTTCAG 207 AAAACTGAAGTTCTTGTGCACCACCG 401. G G T G GT G g A c A T G A A C T T C A ACACCGGTGGTGGACATGAACTTCAG 208AAAACTGAAGTTCATGTCCACCACCG 402. G G T G G a G C A c A T G A A C T T C AACACCGGTGGAGCACATGAACTTCAG 209 AAAACTGAAGTTCATGTGCTCCACCG 403. G G T c GT G C A c A T G A A C T T C A ACACCGGTCGTGCACATGAACTTCAG 210AAAACTGAAGTTCATGTGCACGACCG 404. G c T G G T G C A c A T G A A C T T C AACACCGCTGGTGCACATGAACTTCAG 211 AAAACTGAAGTTCATGTGCACCAGCG 405. G c T G GT G C A G A T G A A C T T C t ACACCGCTGGTGCAGATGAACTTCTG 212AAAACAGAAGTTCATCTGCACCAGCG 406. G c T G G T G C A G A T G A A C T T C AACACCGCTGGTGCAGATGAAGTTCAG 213 AAAACTGAACTTCATCTGCACCAGCG 407. G c T G GT G C A G A T G A A C T T C A ACACCGCTGGTGGAGATGTAGTTGAG 214AAAACTGAAGTACATCTGCACCAGCG 408. G c T G G T G C A G A T G A A C T T C AAGACCGCTGGTGGAGAAGAAGTTGAG 215 AAAACTGAAGTTCTTCTGCACCAGCG 409. G c T G GT G g A G A T G A A C T T C A ACACCGCTGGTGGAGATGAAGTTGAG 216AAAACTGAAGTTCATCTCCACCAGCG 410. G c T G G a G C A G A T G A A C T T C AACACCGCTGGAGGAGATGAAGTTGAG 217 AAAACTGAAGTTCATCTGCTCCAGCG 411. G c T c GT G C A G A T G A A C T T C A ACACCGCTCGTGGAGATGAAGTTGAG 218AAAACTGAAGTTCATCTGCACGAGCG 412. Endogenous Target 1 (VEGFA Site 1) 20 1918 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1oligonucleotide 1 (5′ to 3′) # oligonucleotide 2 (5′ to 3′) # G G G T GG G G G G A G T T T G C T C C ACACCGGGTGGGGGGAGTTTGCTCCG 219AAAACGGAGCAAACTCCCCCCCACCG 413. 
220 414.

Endogenous Target 2 (VEGFA Site 2):
Target site (positions 20 to 1): G A C C C C C T C C A C C C C G C C T C
oligonucleotide 1 (5′ to 3′): ACACCGACCCCCTCCACCCCGCCTCG (# 221)
oligonucleotide 2 (5′ to 3′): AAAACGAGGCGGGGTGGAGGGGGTCG (# 415)
222 416.

Endogenous Target 3 (VEGFA Site 3):
Target site (positions 20 to 1): G G T G A G T G A G T G T G T G C G T G
oligonucleotide 1 (5′ to 3′): ACACCGGTGAGTGAGTGTGTGCGTGG (# 223)
oligonucleotide 2 (5′ to 3′): AAAACCACGCACACACTCACTCACCG (# 417)
224 418.

Endogenous Target 4 (EMX1):
Target site (positions 20 to 1): G A G T C C G A G C A G A A G A A G A A
oligonucleotide 1 (5′ to 3′): ACACCGAGTCCGAGCAGAAGAAGAAG (# 225)
oligonucleotide 2 (5′ to 3′): AAAACTTCTTCTTCTGCTCGGACTCG (# 419)
226 420.

Endogenous Target 5 (RNF2):
Target site (positions 20 to 1): G T C A T C T T A G T C A T T A C C T G
oligonucleotide 1 (5′ to 3′): ACACCGTCATCTTAGTCATTACCTGG (# 227)
oligonucleotide 2 (5′ to 3′): AAAACCAGGTAATGACTAAGATGACG (# 421)
228 422.

Endogenous Target 6 (FANCF):
Target site (positions 20 to 1): G G A A T C C C T T C T G C A G C A C C
oligonucleotide 1 (5′ to 3′): ACACCGGAATCCCTTCTGCAGCACCG (# 229)
oligonucleotide 2 (5′ to 3′): AAAACGGTGCTGCAGAAGGGATTCCG (# 423)

Sequences of oligonucleotides used to generate expression plasmids encoding single gRNAs/variant single gRNAs targeted to sites in the EGFP reporter gene and single gRNAs targeted to six endogenous human gene targets. #, SEQ ID NO:.
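The rows for the endogenous targets appear to follow a regular pattern: oligonucleotide 1 reads as ACACC, then the 20-nt target site, then G, and oligonucleotide 2 reads as AAAAC, then the reverse complement of the target site, then G. The following Python sketch reproduces that pattern for illustration only. The pattern is inferred from the listed sequences rather than stated in the text, and the function names `reverse_complement` and `design_oligos` are hypothetical.

```python
# Illustrative sketch: rebuild the oligonucleotide pair for a 20-nt target site.
# ASSUMPTION: the ACACC/AAAAC prefixes and trailing G are inferred from the
# table rows (e.g., EMX1, FANCF); they are not described as a rule in the text.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_oligos(target: str) -> tuple[str, str]:
    """Return (oligonucleotide 1, oligonucleotide 2), both written 5' to 3'."""
    target = target.upper()
    if set(target) - set("ACGT"):
        raise ValueError("target must contain only A, C, G, T")
    oligo1 = "ACACC" + target + "G"
    oligo2 = "AAAAC" + reverse_complement(target) + "G"
    return oligo1, oligo2


if __name__ == "__main__":
    # EMX1 target site as listed for Endogenous Target 4.
    for oligo in design_oligos("GAGTCCGAGCAGAAGAAGAA"):
        print(oligo)
```

Under this reading, the two oligonucleotides anneal over 22 base pairs and leave 4-nt ACAC and AAAA overhangs. How those overhangs are used for cloning is not described in this section.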

EGFP Activity Assays

U2OS.EGFP cells harboring a single integrated copy of an EGFP-PEST fusion gene were cultured as previously described (Reyon et al., Nat Biotech 30, 460-465 (2012)). For transfections, 200,000 cells were Nucleofected with the indicated amounts of sgRNA expression plasmid and pJDS246 together with 30 ng of a Td-tomato-encoding plasmid using the SE Cell Line 4D-Nucleofector™ X Kit (Lonza) according to the manufacturer's protocol. Cells were analyzed 2 days post-transfection using a BD LSRII flow cytometer. Transfections for optimizing gRNA/Cas9 plasmid concentration were performed in triplicate and all other transfections were performed in duplicate.

PCR Amplification and Sequence Verification of Endogenous Human Genomic Sites

PCR reactions were performed using Phusion Hot Start II high-fidelity DNA polymerase (NEB) with PCR primers and conditions listed in Table B. Most loci amplified successfully using touchdown PCR ([98° C., 10 s; 72-62° C., −1° C./cycle, 15 s; 72° C., 30 s] 10 cycles, [98° C., 10 s; 62° C., 15 s; 72° C., 30 s] 25 cycles). PCR for the remaining targets was performed with 35 cycles at a constant annealing temperature of 68° C. or 72° C. and 3% DMSO or 1M betaine, if necessary. PCR products were analyzed on a QIAXCEL capillary electrophoresis system to verify both size and purity. Validated products were treated with ExoSap-IT (Affymetrix) and sequenced by the Sanger method (MGH DNA Sequencing Core) to verify each target site.
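The touchdown PCR program described above can be written out cycle by cycle. The following Python sketch is illustrative only. It assumes the first touchdown cycle anneals at 72° C. and each later touchdown cycle anneals 1° C. lower, so the tenth anneals at 63° C., before the 25 cycles at 62° C. The function name `touchdown_program` and its parameters are hypothetical.

```python
# Illustrative sketch: expand the touchdown PCR program into per-cycle steps.
# ASSUMPTION: the first touchdown cycle anneals at 72 C and the annealing
# temperature drops by 1 C per cycle; the text gives only "72-62 C, -1 C/cycle".


def touchdown_program(start_anneal=72, plateau_anneal=62, step=1,
                      touchdown_cycles=10, plateau_cycles=25):
    """Return a list of cycles; each cycle is a list of (temp_C, seconds)."""
    cycles = []
    for i in range(touchdown_cycles):
        anneal = start_anneal - i * step
        cycles.append([(98, 10), (anneal, 15), (72, 30)])
    for _ in range(plateau_cycles):
        cycles.append([(98, 10), (plateau_anneal, 15), (72, 30)])
    return cycles


if __name__ == "__main__":
    for number, cycle in enumerate(touchdown_program(), start=1):
        steps = "; ".join(f"{temp} C, {secs} s" for temp, secs in cycle)
        print(f"cycle {number:2d}: {steps}")
```

With the default parameters the program has 35 cycles in total, the same cycle count used for the constant-annealing reactions.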

TABLE B non- Watson- Watson- Crick Actual Target in SEQ ID SEQ ID PCRCrick Trans- Trans- Tran- U2OS.EGFP cells Forward PCR Primer NOReverse PCR Primer NO: Conditions versions versions sitionsTCCAGATGGCACATTGTCAG 436. AGGGAGCAGGAAAGTGAGGT 748. DMSOGGGGCCCACTCTTCTTCCAT 437. ACCCAGACTCCTGGTGTGGC 749. No DMSO 0 0 1GCTAAGCAGAGATGCCTATGCC 438. ACCACCCTTTCCCCCAGAAA 750. DMSO 2 0 0ACCCCACAGCCAGGTTTTCA 439. GAATCACTGCACCTGGCCATC 751. DMSO 0 0 2TGCGGCAACTTCAGACAACC 440. TAAAGGGCGTGCTGGGAGAG 752. DMSO 1 1 0GCATGTCAGGATCTGACCCC 441. TGCAGGGCCATCTTGTGTGT 753. DMSO 0 2 0CCACCACATGTTCTGGGTGC 442. CTGGGTCTGTTCCCTGTGGG 754. DMSO 1 1 1GGCTCTCCCTGCCCTAGTTT 443. GCAGGTCAAGTTGGAACCCG 755. DMSO 0 2 1GGGGCTGAGAACACATGAGATGCA 444. AGATTTGTGCACTGCCTGCCT 756. DMSO 1 0 2CCCGACCTCCGCTCCAAAGC 445. GGACCTCTGCACACCCTGGC 757. DMSO 2 1 0TGCAAGGTCGCATAGTCCCA 446. CAGGAGGGGGAAGTGTGTCC 758. DMSO 1 1 1GCCCATTCTTTTTGCAGTGGA 447. GAGAGCAAGTTTGTTCCCCAGG 759. DMSO 0 1 2GCCCCCAGCCCCTCTGTTTC 448. GCTGCTGGTAGGGGAGCTGG 760. DMSO 1 2 0CGGCTGCCTTCCCTGAGTCC 449. GGGTGACGCTTGCCATGAGC 761. 72 C. Anneal, 1 2 03% DMSO TGACCCTGGAGTACAAAATGTTCCCA 450. GCTGAGACAACCAGCCCAGCT 762.72 C. Anneal, 2 1 0 3% DMSO TGCCTCCACCCTTAGCCCCT 451.GCAGCCGATCCACACTGGGG 763. DMSO 1 0 2 AACTCAGGACAACACTGCCTGT 452.CCCAGGAGCAGGGTACAATGC 764. DMSO 0 1 2 TCCTCCTTGGAGAGGGGCCC 453.CCTTGGAAGGGGCCTTGGTGG 765. DMSO 0 3 0 CCGAGGGCATGGGCAATCCT 454.GGCTGCTGCGAGTTGCCAAC 766. DMSO 0 1 3 TGCTTTGCATGGGGTCTCAGACA 455.GGGTTGCTTGCCCTCTGTGT 767. DMSO 0 2 2 AGCTCCTTCTCATTTCTCTTCTGCTGT 456.CACAGAAGGATGTGTGCAGGTT 768. DMSO 0 2 2 AGCAGACACAGGTGAATGCTGCT 457.GGTCAGGTGTGCTGCTAGGCA 769. DMSO 1 1 2 CCTGTGGGGCTCTCAGGTGC 458.ACTGCCTGCCAAAGTGGGTGT 770. No DMSO TD 1 1 2 AGCTGCACTGGGGAATGAGT 459.TGCCGGGTAATAGCTGGCTT 771. DMSO 0 1 3 CCAGCCTGGGCAACAAAGCG 460.GGGGGCTTCCAGGTCACAGG 772. 72 C. Anneal, 0 3 1 3% DMSO, 6% DMSOTACCCCCACTGCCCCATTGC 461. ACAGGTCCATGCTTAGCAGAGG 773. DMSO 0 1 3 GGGGTGATTGAAGTTTGCT ACGGATTCACGACGGAGGTGC 462. 
CCGAGTCCGTGGCAGAGAGC 774.DMSO 0/1 2 2 CCAGG GGGTGATTGAAGTTTGCT GCAGG (SEQ ID NO: 424)TGTGGTTGAAGTAGGGGACAGGT 463. TGGCCCAATTGGAAGTGATTTC 775. DMSO 3 1 0 GTTGGGATGGCAGAGTCATCAACGT 464. GGCCCAATCGGTAGAGGATGCA 776. DMSO 0 3 1ATGGGGCGCTCCAGTCTGTG 465. TGCACCCACACAGCCAGCAA 777. DMSO 0 3 1GGGGAGGGAGGACCAGGGAA 466. AATTAGCTGGGCGCGGTGGT 778. 72 C. Anneal, 0 1 33% DMSO ATCCCGTGCAGGAAGTCGCC 467. CAGGCGGCCCCTTGAGGAAT 779. DMSO 3 1 0CCCCAACCCTTTGCTCAGCG 468. TGAGGAGAACACCACAGGCAGA 780. DMSO 1 2 1ATCGACGAGGAGGGGGCCTT 469. CCCCTCACTCAAGCAGGCCC 781. DMSO 0 3 1TGCTCAAGGGGCCTGTTCCA 470. CAGGGGCAGTGGCAGGAGTC 782. No DMSO 1 3 0TGCCTGGCACGCAGTAGGTG 471. GGGAAGGGGGAACAGGTGCA 783. DMSO 0 0 5Not optimized 1 1 3 ACCTGGGCTTGCCACTAGGG 472. GCTGCTCGCAGTTAAGCACCA 784.DMSO 1 3 1 GTGGCCGGGCTACTGCTACC 473. GGTTCCACAAGCTGGGGGCA 785. DMSO 3 20 Not optimized 1 3 1 GCAAGAGGCGGAGGAGACCC 474. AGAGTCATCCATTTCCTGGGGG786. DMSO 2 3 0 C GGGGTCAGTGGTGATATCCCCCT 475. AGGGAATCCTTTTTCCATTGCT787. 1M betaine, 1 4 0 TGTTT TD AGAGAGGCCACGTGGAGGGT 476.GCCTCCCCTCCTCCTTCCCA 788. DMSO 1 3 1 GACAGTGCCTTGCGATGCAC 477.TCTGACCGGTATGCCTGACG 789. DMSO 3 2 0 TGTGTGAACGCAGCCTGGCT 478.TGGTCTAGTACTTCCTCCAGCC 790. DMSO 3 1 1 TT GGTTCTCCCTTGGCTCCTGTGA 479.CCCACTGCTCCTAGCCCTGC 791. DMSO 1 3 1 TGAAGTCAACAATCTAAGCTTCCACCT 480.AGCTTTGGTAGTTGGAGTCTTT 792. DMSO 3 1 2 GAAGG TGATTGGGCTGCAGTTCATGTACA481. GCACAGCCTGCCCTTGGAAG 793. DMSO 2 1 3 TCCATGGGCCCCTCTGAAAGA 482.AGCGGCTTCTGCTTCTGCGA 794. DMSO 1 0 5 GCGGTTGGTGGGGTTGATGC 483.GAGTTCCTCCTCCCGCCAGT 795. DMSO 2 0 4 AGGCAAGATTTTCCAGTGTGCAAGA 484.GCTTTTGCCTGGGACTCCGC 796. DMSO 2 0 4 GCTGCTGGTCGGGCTCTCTG 485.GCTCTGTCCCACTTCCCCTGG 797. No DMSO TD 3 1 2 GCTGCGAGGCTTCCGTGAGA 486.CGCCCCTAGAGCTAAGGGGGT 798. DMSO 3 2 1 CCAGGAGCCTGAGAGCTGCC 487.AGGGCTAGGACTGCAGTGAGC 799. DMSO 1 3 2 CTGTGCTCAGCCTGGGTGCT 488.GCCTGGGGCTGTGAGTAGTTT 800. DMSO 2 3 1 AGCTCGCGCCAGATCTGTGG 489.ACTTGGCAGGCTGAGGCAGG 801. 72 C. Anneal, 4 2 0 3% DMSOAGAGAAGTCGAGGAAGAGAGAG 490. CAGCAGAAAGTTCATGGTTTCG 802. 
DMSOTGGACAGCTGCAGTACTCCCTG 491. ACTGATCGATGATGGCCTATGGG 803. DMSO 0 0 2 TCAAGATGTGCACTTGGGCTA 492. GCAGCCTATTGTCTCCTGGT 804. DMSO 1 0 1GTCCAGTGCCTGACCCTGGC 493. AGCATCATGCCTCCAGCTTCA 805. DMSO 1 1 1GCTCCCGATCCTCTGCCACC 494. GCAGCTCCCACCACCCTCAG 806. DMSO 1 2 0GGGGACAGGCAGGCAAGGAG 495. GTGCGTGTCCGTTCACCCCT 807. DMSO 1 1 1AAGGGGCTGCTGGGTAGGAC 496. CGTGATTCGAGTTCCTGGCA 808. DMSO 2 1 0GACCCTCAGGAAGCTGGGAG 497. CTGCGAGATGCCCCAAATCG 809. 1M betaine, 1 0 2 TDCCGCGGCGCTCTGCTAGA 498. TGCTGGGATTACAGGCGCGA 810. DMSO 1 1 1CCAGGTGGTGTCAGCGGAGG 499. TGCCTGGCCCTCTCTGAGTCT 811. DMSO 0 2 1CGACTCCACGGCGTCTCAGG 500. CAGCGCAGTCCAGCCCGATG 812. 1M betaine, 2 1 0 TDCTTCCCTCCCCCAGCACCAC 501. GCTACAGGTTGCACAGTGAGAGG 813. DMSO 1 1 1 TCCCCGGGGAGTCTGTCCTGA 502. CCCAGCCGTTCCAGGTCTTCC 814. 72 C. Anneal, 1 0 23% DMSO GAAGCGCGAAAACCCGGCTC 503. TCCAGGGTCCTTCTCGGCCC 815. DMSO 1 0 2AGGGTGGTCAGGGAGGCCTT 504. CATGGGGCTCGGACCTCGTC 816. DMSO 2 0 1GGGAAGAGGCAGGGCTGTCG 505. TGCCAGGAAGGAAGCTGGCC 817. 72 C. Anneal, 0 2 13% DMSO GAGTGACGATGAGCCCCGGG 506. CCCTTAGCTGCAGTCGCCCC 818. 68 C. Anneal0 1 3 3% DMSO, CCCATGAGGGGTTTGAGTGC 507. TGAAGATGGGCAGTTTGGGG 819. DMSO0 2 2 CACCTGGGGCATCTGGGTGG 508. ACTGGGGTTGGGGAGGGGAT 820. DMSO 2 0 2TCATGATCCCCAAAAGGGCT 509. CCATTTGTGCTGATCTGTGGGT 821. DMSO 1 0 3TGGTGCCCAGAATAGTGGCCA 510. AGGAAATGTGTTGTGCCAGGGC 822. DMSO 1 2 1GCCTCAGACAACCCTGCCCC 511. GCCAAGTGTTACTCATCAAGAAA 823. No DMSO TD 2 1 1GTGG GCCGGGACAAGACTGAGTTGGG 512. TCCCGAACTCCCGCAAAACG 824. DMSO 1 2 1TGCTGCAGGTGGTTCCGGAG 513. CTGGAACCGCATCCTCCGCA 825. No DMSO TD 1 0 3ACACTGGTCCAGGTCCCGTCT 514. GGCTGTGCCTTCCGATGGAA 826. DMSO 2 1 1CTCTCCCCCCACCCCCCC ATCGCGCCCAAAGCACAGGT 515. AGGCTTCTGGAAAAGTCCTCAAT827. DMSO 3 0 2 TCTGG GCA (SEQ ID NO: 425) Not optimized 1 1 2CCCTCATGGTGGTCTTACGGCA 516. AGCCACACATCTTTCTGGTAGGG 828. DMSO 1 1 2TGCGTCGCTCATGCTGGGAG 517. AGGGTGGGGTGTACTGGCTCA 829. DMSO 0 3 1GAGCTGAGACGGCACCACTG 518. TGGCCTTGAACTCTTGGGCT 830. 1M betaine, 0 1 3 TDNot optimized 1 2 1 AGTGAGAGTGGCACGAACCA 519. 
CAGTAGGTGGTCCCTTCCGC 831.DMSO 2 1 1 Not optimized 832. 1 1 3 GGGAGAACCTTGTCCAGCCT 520.AAGCCGAAAAGCTGGGCAAA 833. DMSO 0 2 3 CTTCCCAGTGTGGCCCGTCC 521.ACACAGTCAGAGCTCCGCCG 834. DMSO 1 1 3 Not optimized 1 0 4CTGAGAGGGGGAGGGGGAGG 522. TCGACTGGTCTTGTCCTCCCA 835. 68 C. Anneal, 3 0 23% DMSO CAGCCTGCTGCATCGGAAAA 523. TGCAGCCAAGAGAAAAAGCCT 836. 1M betaine,1 0 4 TD TCCCTCTGACCCGGAACCCA 524. ACCCGACTTCCTCCCCATTGC 837. DMSO 2 1 2TGGGGGTTGCGTGCTIGICA 525. GCCAGGAGGACACCAGGACC 838. DMSO 4 1 0ATCAGGTGCCAGGAGGACAC 526. GGCCTGAGAGTGGAGAGTGG 839. DMSO 4 1 0Not optimized 1 4 0 TGAGCCACATGAATCAAGGCCTCC 527.ACCTCTCCAAGTCTCAGTAACTCT 840. DMSO 1 3 1 CT GGTCCCTCTGTGCAGTGGAA 528.CTTTGGTGGACCTGCACAGC 841. DMSO 2 2 2 GCGAGGCTGCTGACTTCCCT 529.GCTGGGACTACAGACATGTGCCA 842. DMSO 2 2 2 ATTTCCTCCCCCCCC-ATTGCAGGCGTGTCCAGGCA 530. AAATCCTGCATGGTGATGGGAGT 843. DMSO 1 1 5CCTCAGG (SEQ ID NO: 426) TGCTCTGCCATTTATGTCCTATGAACT 531.ACAGCCTCTTCTCCATGACTGAGC 844. DMSO 1 3 2 TCCGCCCAAACAGGAGGCAG 532.GCGGTGGGGAAGCCATTGAG 845. DMSO 2 3 1 GGGGGTCTGGCTCACCTGGA 533.CCTGTCGGGAGAGTGCCTGC 846. DMSO 3 1 2 TCCTGGTTCATTTGCTAGAACTCTGGA 534.ACTCCAGATGCAACCAGGGCT 847. DMSO 3 2 1 CGTGTGGTGAGCCTGAGTCT 535.GCTTCACCGTAGAGGCTGCT 848. DMSO 3 0 3 AGGCCCTGATAATTCATGCTACCAA 536.TCAGTGACAACCTTTTGTATTCGG 849. DMSO 0 2 4 CA Not optimized 2 2 2 537.TCCAGATGGCACATTGTCAG 538. AGGGAGCAGGAAAGTGAGGT 850. DMSOGCAGGCAAGCTGTCAAGGGT 539. CACCGACACACCCACTCACC 851. DMSO 0 0 1GAGGGGGAAGTCACCGACAA 540. TACCCGGGCCGTCTGTTAGA 852. DMSO 0 0 2GACACCCCACACACTCTCATGC 541. TGAATCCCTTCACCCCCAAG 853. DMSO 1 0 1TCCTTTGAGGTTCATCCCCC 542. CCAATCCAGGATGATTCCGC 854. DMSO 1 0 1CAGGGCCAGGAACACAGGAA 543. GGGAGGTATGTGCGGGAGTG 855. DMSO 1 1 0TGCAGCCTGAGTGAGCAAGTGT 544. GCCCAGGTGCTAAGCCCCTC 856. DMSO 1 0 1TACAGCCTGGGTGATGGAGC 545. TGTGTCATGGACTTTCCCATTGT 857. 1M betaine, 1 1 0TD GGCAGGCATTAAACTCATCAGGTCC 546. TCTCCCCCAAGGTATCAGAGAGCT 858. DMSO 1 10 GGGCCTCCCTGCTGGTTCTC 547. GCTGCCGTCCGAACCCAAGA 859. DMSO 0 1 1ACAAACGCAGGTGGACCGAA 548. ACTCCGAAAATGCCCCGCAGT 860. 
DMSO 1 1 0AGGGGAGGGGACATTGCCT 549. TTGAGAGGGTTCAGTGGTTGC 861. DMSO 1 0 1CTAATGCTTACGGCTGCGGG 550. AGCCAACGGCAGATGCAAAT 862. DMSO 1 0 1GAGCGAAGTTAACCCACCGC 551. CACACATGCACATGCCCCTG 863. 68 C., 3% 2 0 0 DMSOGCATGTGTCTAACTGGAGACAATAGCA 552. TCCCCCATATCAACACACACA 864. DMSO 2 0 0GCCCCTCCCGCCTTTTGTGT 553. TGGGCAAAGGACATGAAACAGAC 865. DMSO 2 0 0 AGCCTCAGCTCTGCTCTTAAGCCC 554. ACGAACAGATCATTTTTCATGGCT 866. DMSO 2 0 0TCC CTCCAGAGCCTGGCCTACCA 555. CCCTCTCCGGAAGTGCCTTG 867. DMSO 0 1 1TCTGTCACCACACAGTTACCACC 556. GTTGCCTGGGGATGGGGTAT 868. DMSO 0 1 1GGGGACCCTCAAGAGGCACT 557. GGGCATCAAAGGATGGGGAT 869. DMSO 2 0 1TGTGGAGGGTGGGACCTGGT 558. ACAGTGAGGTGCGGTCTTTGGG 870. DMSO 1 0 2CGGGGTGGCAGTGACGTCAA 559. GGTGCAGTCCAAGAGCCCCC 871. DMSO 0 0 3AGCTGAGGCAGAGTCCCCGA 560. GGGAGACAGAGCAGCGCCTC 872. DMSO 1 1 1ACCACCAGACCCCACCTCCA 561. AGGACGACTTGTGCCCCATTCA 873. 72 C. Anneal, 1 11 3% DMSO GGGTCAGGACGCAGGTCAGA 562. TCCACCCACCCACCCATCCT 874.72 C. Anneal, 2 0 1 3% DMSO ACACTCTGGGCTAGGTGCTGGA 563.GCCCCCTCACCACATGATGCT 875. DMSO 2 0 1 GGGGCCATTCCTCTGCTGCA 564.TGGGGATCCTTGCTCATGGC 876. DMSO 3 0 0 ACACACTGGCTCGCATTCACCA 565.CCTGCACGAGGCCAGGTGTT 877. DMSO 2 1 0 TGGGCACGTAGTAAACTGCACCA 566.CTCGCCGCCGTGACTGTAGG 878. DMSO 0 3 1 TCAGCTGGTCCTGGGCTTGG 567.AGAGCACTGGGTAGCAGTCAGT 879. DMSO 2 1 0 AGACACAGCCAGGGCCTCAG 568.GGTGGGCGTGTGTGTGTACC 880. 68 C., 3% 1 1 1 DMSO ACACTCTCACACACGCACCAA569. GAGAAGTCAGGGCTGGCGGG 881. 72 C. Anneal, 1 2 0 3% DMSOACTGCCTGCATTTCCCCGGT 570. TGGTGAGGGCTTCAGGGAGC 882. DMSO 1 1 1GCCAGGTTCATTGACTGCCC 571. TCCTTCTACACATCGGCGGC 883. DMSO 2 1 0CGAGGGAGCCGAGTTCGTAA 572. CTGACCTGGGGCTCTGGTAC 884. DMSO 1 2 0TCCTCGGGAAGTCATGGCTTCA 573. GCACTGAGCAACCAGGAGCAC 885. DMSO 2 1 0Not optimized 1 0 3 TAAACCGTTGCCCCCGCCTC 574. GCTCCCCTGCCAGGTGAACC 886.DMSO 2 1 1 CCTGCTGAGACTCCAGGTCC 575. CTGCGGAGTGGCTGGCTATA 887. DMSO 2 02 CTCGGGGACTGACAAGCCGG 576. GGAGCAGCTCTTCCAGGGCC 888. DMSO 3 0 1CCCCGACCAAAGCAGGAGCA 577. CTGGCAGCCTCTGGATGGGG 889. DMSO 1 2 1Not optimized 0 3 1 ATTTCAGAGCCCCGGGGAAA 578. 
AGGCCGCGGTGTTATGGTTA 890.DMSO 1 2 1 GCCAGTGGCTTAGTGTCTTTGTGT 579. TGACATATTTTCCTGGGCCATGGG 891.DMSO 2 1 1 T TGCCAGAAGAACATGGGCCAGA 580. CCATGCTGACATCATATACTGGGA 892.DMSO 3 1 0 AGC GCGTGTCTCTGTGTGCGTGC 581. CCAGGCTGGGCACACAGGTT 893. DMSO3 1 0 Not optimized 2 2 0 TGCCCAGTCCAATATTTCAGCAGCT 582.AGGATGAGTTCATGTCCTTTGTG 894. DMSO 2 2 0 GGG GGGTGAAAATTTGGTACTGTTAGCTGT583. AATGACTCATTCCCTGGGTATCTC 895. DMSO 2 2 0 CCA TGCCCCATCAATCACCTCGGC584. CAAGGTCGGCAGGGCAGTGA 896. DMSO 1 2 2 GCCTCCTCTGCCGCTGGTAA 585.TGAGAGTTCCTGTTGCTCCACACT 897. DMSO 1 2 2 Not optimized 2 2 1GCCACCAAAATAGCCAGCGT 586. ACATGCATCTGTGTGTGCGT 898. DMSO 3 0 2ACAGACTGACCCTTGAAAAATACCAGT 587. TGTATCTTTCTTGCCAATGGTTTTC 899. DMSO 2 12 CC AGCCAAATTTCTCAACAGCAGCACT 588. TCCTGGAGAGCAGGCATTTTTGT 900. DMSO 31 1 ACCTCCTTGTGCTGCCTGGC 589. GGCGGGAAGGTAACCCTGGG 901. DMSO 2 1 2CACAAAGCTCTACCTTTCCAGTAGTGT 590. TGATCCGATGGTTGTTCACAGCT 902. DMSO 3 1 1TGTGGGGATTACCTGCCTGGC 591. ACGCACAAAAATGCCCTTGTCA 903. DMSO 2 2 1TGAGGCAGACCAGTCATCCAGC 592. GCCCGAGCACAGTGTAGGGC 904. DMSO 2 3 0ATTAGCTGGGCGTGGCGGAG 593. ACTGCATCTCATCTCAGGCAGCT 905. DMSO 2 1 3TGAAGCAGAAGGAGTGGAGAAGGA 594. TCAGCTTCACATCTGTTTCAGTTC 906. DMSO 4 0 2AGT TGGTGGAGTGTGTGTGTGGT 595. AGAGCAGAAAGAGAGTGCCCA 907. DMSO 1 3 2GCCCCTGTACGTCCTGACAGC 596. TGCACAAGCCACTTAGCCTCTCT 908. DMSO 3 1 2AGCGCAGGTAAACAGGCCCA 597. TCTCTCGCCCCGTTTCCTTGT 909. DMSO 3 1 2ATGGGTGCCAGGTACCACGC 598. ACAGCAGGAAGGAGCCGCAG 910. DMSO 2 3 1CGGGCGGGTGGACAGATGAG 599. AGGAGGTCTCGAGCCAGGGG 911. DMSO 2 3 1TCAACCTAGTGAACACAGACCACTGA 600. GTCTATATACAGCCCACAACCTCA 912. DMSO 1 2 3TGT GCCAGGGCCAGTGGATTGCT 601. TGTCATTTCTTAGTATGTCAGCCG 913. DMSO 2 4 0GA GAGCCCCACCGGTTCAGTCC 602. GCCAGAGCTACCCACTCGCC 914. DMSO 1 3 2 603.GGAGCAGCTGGTCAGAGGGG 604. GGGAAGGGGGACACTGGGGA 915. DMSOTCTCTCCTTCAACTCATGACCAGCT 605. ATCTGCACATGTATGTACAGGAG 916. DMSO 0 1 1TCAT AAGACAGAGGAGAAGAAG TGGGGAATCTCCAAAGAACCCCC 606.AGGGTGTACTGTGGGAACTTTGC 917. DMSO 2 1 1 AAGGG A (SEQ ID NO: 427)GATGGCCCCACTGAGCACGT 607. ACTTCGTAGAGCCTTAAACATGTG 918. 
DMSO 1 0 2 GCAGGATTAATGTTTAAAGTCACTGGTGG 608. TCAAACAAGGTGCAGATACAGCA 919.1M betaine, 1 0 2 TD TCCAAGCCACTGGTTTCTCAGTCA 609.TGCTCTGTGGATCATATTTTGGGG 920. DMSO 0 1 2 GA ACTTTCAGAGCTTGGGGCAGGT 610.CCCACGCTGAAGTGCAATGGC 921. DMSO 1 1 1 CAAAGCATGCCTTTCAGCCG 611.GGCTCTTCGATTTGGCACCT 922. 1M betaine, 1 1 1 TD Not optimized 1 0 2GGACTCCCTGCAGCTCCAGC 612. AGGAACACAGGCCAGGCTGG 923. 72 C. Anneal, 0 0 36% DMSO CCCTTTAGGCACCTTCCCCA 613. CCGACCTTCATCCCTCCTGG 924. DMSO 0 1 2TGATTCTGCCTTAGAGTCCCAGGT 614. TGGGCTCTGTGTCCCTACCCA 925. DMSO 0 3 0Not optimized 2 1 0 AGGCAGGAGAGCAAGCAGGT 615. ACCCTGACTACTGACTGACCGCT926. DMSO 0 1 2 CTCCCCATTGCGACCCGAGG 616. AGAGGCATTGACTTGGAGCACCT 927.DMSO 1 2 0 CTGGAGCCCAGCAGGAAGGC 617. CCTCAGGGAGGGGGCCTGAT 928. DMSO 1 20 ACTGTGGGCGTTGTCCCCAC 618. AGGTCGGTGCAGGGTTTAAGGA 929. DMSO 1 0 3GGCGCTCCCTTTTTCCCTTTGT 619. CGTCACCCATCGTCTCGTGGA 930. DMSO 2 0 2TGCCATCTATAGCAGCCCCCT 620. GCATCTTGCTAACCGTACTTCTTC 931. DMSO 1 0 3 TGAGTGGAGACGCTAAACCTGTGAGGT 621. GCTCCTGGCCTCTTCCTACAGC 932. DMSO 1 2 1CCGAACTTCTGCTGAGCTTGATGC 622. CCAAGTCAATGGGCAACAAGGGA 933. DMSO 0 2 2Not optimized 1 1 2 TGCCCCCAAGACCTTTCTCC 623. ATGGCAGGCAGAGGAGGAAG 934.DMSO 2 0 2 GGGTGGGGCCATTGTGGGTT 624. CTGGGGCCAGGGTTTCTGCC 935. DMSO 3 01 TGGAGAACATGAGAGGCTTGCAA 625. TCCTTCTGTAGGCAATGGGAACA 936. DMSO 3 0 1 AGCCACATGGTAGAAGTCGGC 626. GGCAGATTTCCCCCATGCTG 937. 1M betaine, 1 2 1 TDTGTACACCCCAAGTCCTCCC 627. AAGGGGAGTGTGCAAGCCTC 938. DMSO 3 1 0AGGTCTGGCTAGAGATGCAGCA 628. AGTCCAACACTCAGGTGAGACCC 939. DMSO 3 1 0 TCCAAGAGGACCCAGCTGTTGGA 629. GGGTATGGAATTCTGGATTAGCA 940. DMSO 0 2 2 GAGCACCATCTCTTCATTGATGAGTCCCAA 630. ACACTGTGAGTATGCTTGGCGT 941. DMSO 2 2 0GGCTGCGGGGAGATGAGCTC 631. TCGGATGCTTTTCCACAGGGCT 942. DMSO 2 2 1TCTTCCAGGAGGGCAGCTCC 632. CCAATCCTGAGCTCCTACAAGGCT 943. DMSO 1 0 4GAGCTGCACTGGATGGCACT 633. TGCTGGTTAAGGGGTGTTTTGGA 944. DMSO 1 1 3TCTGGGAAGGTGAGGAGGCCA 634. TGGGGGACAATGGAAAAGCAAT 945. DMSO 0 2 3 GACTTGCTCCCAGCCTGACCCC 635. AGCCCTTGCCATGCAGGACC 946. DMSO 3 1 1GGGATTTTTATCTGTTGGGTGCGAA 636. 
AACCACAGATGTACCCTCAAAGCT 947. DMSO 2 2 1ACCCATCAGGACCGCAGCAC 637. TCTGGAACCTGGGAGGCGGA 948. 72 C. Anneal, 3 1 13% DMSO CGTCCCTCACAGCCAGCCTC 638. CCTCCTTGGGCCTGGGGTTC 949. DMSO 1 3 1CCCTCTGCAAGGTGGAGTCTCC 639. AGATGTTCTGTCCCCAGGCCT 950. DMSO 1 3 1GGCTTCCACTGCTGAAGGCCT 640. TGCCGCTCCACATACCCTCC 951. DMSO 2 1 2AGCATTGCCTGTCGGGTGATGT 641. AGCACCTATTGGACACTGGTTCTC 952. DMSO 1 3 1 TTCTAGAGCAGGGGCACAATGC 642. TGGAGATGGAGCCTGGTGGGA 953. DMSO 2 2 1GGTCTCAGAAAATGGAGAGAAAGCACG 643. CCCACAGAAACCTGGGCCCT 954. DMSO 1 2 3GGTTGCTGATACCAAAACGTTTGCCT 644. TGGGTCCTCTCCACCTCTGCA 955. DMSO 0 3 3ACTCTCCTTAAGTACTGATATGGCTGT 645. CAGAATCTTGCTCTGTTGCCCA 956. DMSO 0 4 2Not optimized 2 2 2 Not optimized 2 2 2 CAATGCCTGCAGTCCTCAGGA 646.TCCCAAGAGAAAACTCTGTCCTGA 957. DMSO 4 1 1 CA GCATTGGCTGCCCAGGGAAA 647.TGGCTGTGCTGGGCTGTGTT 958. DMSO 2 2 2 CCACAAGCCTCAGCCTACCCG 648.ACAGGTGCCAAAACACTGCCT 959. DMSO 2 1 3 TCATTGCAGCAGAAGAAGGCCTCTTGCAAATGAGACTCCTTT 649. CGATCAGTCCCCTGGCGTCC 960. DMSO 2/1 2/3 2AAAGG TCATTGTAGCAGAAGAAG AAAGG (SEQ ID NO: 428) TCCCAGAATCTGCCTCCGCA650. AGGGGTTTCCAGGCACATGGG 961. DMSO 0 4 2 651. 962.TCCTAAAAATCAGTTTTGAGATTTACTTCC 652. AAAGTGTTAGCCAACATACAGAA 963. DMSOGTCAGGA GGTATCTAAGTCATTACC ACATCTGGGGAAAGCAAAAGTCAACA 653.TGTCTGAGTATCTAGGCTAAAAG 964. DMSO 1/2 1 1 TGTGG TGGT GGTATCTAAGTCAATACCTGTGG (SEQ ID NO: 429) ACGATCTTGUTCATTTCCCTGTACA 654.AGTGCTTTGTGAACTGAAAAGCA 965. DMSO 0 3 0 AACA GCACCTTGGTGCTGCTAAATGCC655. GGGCAACTGAACAGGCATGAATG 966. DMSO 1 2 0 G AACTGTCCTGCATCCCCGCC 656.GGTGCACCTGGATCCACCCA 967. DMSO 1 1 1 Not optimized 1 1 1CATCACCCTCCACCAGGCCC 657. ACCACTGCTGCAGGCTCCAG 968. 72 C. Anneal, 0 3 03% DMSO Not optimized 2 0 2 CCTGACCCGTGGTTCCCGAC 658.TGGTGCGTGGTGTGTGTGGT 969. 72 C. Anneal, 1 2 1 3% DMSOTGGGAACATTGGAGAAGTTTCCTGA 659. CCATGTGACTACTGGGCTGCCC 970. DMSO 1 1 2AGCCTTGGCAAGCAACTCCCT 660. GGTTCTCTCTCTCAGAAAAGAAA 971. DMSO 1 0 3 GAGGGGCAGCGGACTTCAGAGCCA 661. GCCAGAGGCTCTCAGCAGTGC 972. DMSO 1 0 3CCAGCCTGGTCAATATGGCA 662. ACTGTGCCCAGCCCCATATT 973. 
DMSO 2 1 1ATGCCAACACTCGAGGGGCC 663. CGGGTTGTGGCACCGGGTTA 974. DMSO 2 1 1TTGCTCTAGTGGGGAGGGGG 664. AGAGTTCAGGCATGAAAAGAAGC 975. DMSO 3 0 1 AACAAGCTGAAGATAGCAGTGTTTAAGCCT 665. TGCAATTTGAGGGGCTCTCTTCA 976. DMSO 1 1 2AGTCACTGGAGTAAGCCTGCCT 666. TGCCAGCCAAAAGTTGTTAGTGT 977. DMSO 2 0 2 GTGGGTCTCCCTCAGTGCCCTG 667. TGTGTGGTAGGGAGCAAAACGAC 978. DMSO 2 0 2 ATGGGGGCTGTTAAGAGGCACA 668. TGACCACACACACCCCCACG 979. DMSO 1 2 1TCAAAACAGATTGACCAAGGCCAAAT 669. TGTGTTTTTAAGCTGCACCCCAGG 980. DMSO 1 0 3TCTGGCACCAGGACTGATTGTACA 670. GCACGCAGCTGACTCCCAGA 981. DMSO 1 2 1Not optimized 1 0 3 AGCATCTGTGATACCCTACCTGTCT 671. ACCAGGGCTGCCACAGAGTC982. DMSO 1 0 3 TAGTCTTGTTGCCCAGGCTG 672. CTCGGCCCCTGAGAGTTCAT 983. DMSO1 2 1 TCCATCTCACTCATTACC CTGCAACCAGGGCCCTTACC 673. GAGCAGCAGCAAAGCCACCG984. DMSO 1 1 2 TGAGGTCCATCTCACTCA TTACCTGATG (SEQ ID NO: 430)GCCTGGAGAGCAAGCCTGGG 674. AGCCGAGACAATCTGCCCCG 985. DMSO 1 1 2TTTATATTAGTGATTACC AGTGAAACAAACAAGCAGCAGTCTGA 675. GGCAGGTCTGACCAGTGGGG986. No DMSO TD 1 2 1 TGCGG (SEQ ID NO: 431) AGGCTCAGAGAGGTAAGCAATGGA676. TGAGTAGACAGAAATGTTACCGG 987. DMSO 3 0 2 TGTTTCAGAGATGTTAAAGCCTTGGTGGG 677. AGTGAACCAAGGGAATGGGGGA 988. DMSO 3 0 2TGTGCTTTCTGGGGTAGTGGCA 678. CACCTCAGCCCTGTAGTCCTGG 989. DMSO 0 4 1CCATTGGGTGACTGAATGCACA 679. GCCACTGTCCCCAGCCTATT 990. 1M betaine, 1 3 1TD ACCAAGAAAGTGAAAAGGAAACCC 680. TGAGATGGCATACGATTTACCCA 991. DMSO 1 2 2AGGGTGGGGACTGAAAGGAGCT 681. TGGCATCACTCAGAGATTGGAAC 992. DMSO 3 1 1 ACAACCAGTGCTGTGTGACCTTGGA 682. TCCTATGGGAGGGGAGGCTTCT 993. DMSO 3 1 1CCAGGTGTGGTGGTTCATGAC 683. GCATACGGCAGTAGAATGAGCC 994. 68 C., 3% 4 0 1DMSO CAGGCGCTGGGTTCTTAGCCT 684. CCTTCCTGGGCCCCATGGTG 995. DMSO 2 3 0TGGGGTCCAAGATGTCCCCT 685. TGAAACTGCTTGATGAGGTGTGG 996. DMSO 1 2 2 AGCTGGGCTTGGTGGTATATGC 686. ACTTGCAAAGCTGATAACTGACT 997. DMSO 5 0 1 GAAGTTGGTGTCACTGACAATGGGA 687. CGCAGCGCACGAGTTCATCA 998. DMSO 3 0 3AGAGGAGGCACAATTCAACCCCT 688. GGCTGGGGAGGCCTCACAAT 999. DMSO 1 1 4GGGAAAGTTTGGGAAAGTCAGCA 689. AGGACAAGCTACCCCACACC 1000. DMSO 1 3 2TGGTGCATCAAAGGGTTGCTTCT 690. 
TCATTCCAGCACGCCGGGAG 1001. DMSO 0 3 3CCCAGGCTGCCCATCACACT 691. TGGAGTAAGTATACCTTGGGGAC 1002. DMSO 1 3 2 CTTCAGTGCCCCTGGGTCCTCA 692. TGTGCAAATACCTAGCACGGTGC 1003. DMSO 4 2 0AGCACTCCCTTTTGAATTTTGGTGCT 693. ACTGAAGTCCAGCCTCTTCCATTT 1004. DMSO 2 13 CA GAAACCGGTCCCTGGTGCCA 694. GGGGAGTAGAGGGTAGTGTTGC 1005. DMSO 2 0 4 CTTGCGGGTCCCTGTGGAGTC 695. AGGTGCCGTGTTGTGCCCAA 1006. DMSO 1 2 3 696.1007. GCCCTACATCTGCTCTCCCTCCA 697. GGGCCGGGAAAGAGTTGCTG 1008. DMSOTTGGAGTGTGGCCCGGGTTG 698. ACCTCTCTTTCTCTGCCTCACTGT 1009. DMSO 0 1 1CACACCATGCTGATCCAGGC 699. GCAGTACGGAAGCACGAAGC 1010. DMSO 1 1 1CTCCAGGGCTCGCTGTCCAC 700. CTGGGCTCTGCTGGTTCCCC 1011. DMSO 0 2 1CTGTGGTAGCCGTGGCCAGG 701. CCCCATACCACCTCTCCGGGA 1012. DMSO 0 2 1GGTGGCGGGACTTGAATGAG 702. CCAGCGTGTTTCCAAGGGAT 1013. 1M betaine, 0 1 2TD GGAATCCCCTCTCCAGCC CCAGAGGTGGGGCCCTGTGA 703. TTTCCACACTCAGTTCTGCAGGA1014. DMSO 1 1 1/2 CCTGG GGAATCCCCTCTCCAGCC TCTGG (SEQ ID NO: 432)GGAATCTCTTCCTTGGCA TGTGACTGGTTGTCCTGCTTTCCT 704. GCAGTGTTTTGTGGTGATGGGCA1015. 1M betaine 0 1 5 TCTGG TD (SEQ ID NO: 433) CTGGCCAAGGGGTGAGTGGG705. TGGGACCCCAGCAGCCAATG 1016. DMSO 1 0 2 ACGGTGTGCTGGCTGCTCTT 706.ACAGTGCTGACCGTGCTGGG 1017. DMSO 1 1 1 TGGTTTGGGCCTCAGGGATGG 707.TGCCTCCCACAAAAATGTCTACCT 1018. DMSO 0 0 3 TGGTTTGGGCCTCAGGGATGG 708.ACCCCTTATCCCAGAACCCATGA 1019. DMSO 0 0 3 TCCAAGTCAGCGATGAGGGCT 709.TGGGAGCTGTTCCTTTTTGGCCA 1020. DMSO 0 3 0 CACCCCTCTCAGCTTCCCAA 710.GCTAGAGGGTCTGCTGCCTT 1021. DMSO 1 2 0 AGACCCCTTGGCCAAGCACA 711.CTTGCTCTCACCCCGCCTCC 1022. DMSO 2 1 0 ACATGTGGGAGGCGGACAGA 712.TCTCACTTTGCTGTTACCGATGTC 1023. DMSO 0 1 3 G GGACGACTGTGCCTGGGACA 713.AGTGCCCAGAGTGTTGTAACTGC 1024. 72 C. Anneal 0 1 3 T 3% DMSO,GGAGAGCTCAGCGCCAGGTC 714. CAGCGTGGCCCGTGGGAATA 1025. DMSO 1 1 2GCTGAAGTGCTCTGGGGTGCT 715. ACCCCACTGTGGATGAATTGGTA 1026. DMSO 1 1 2 CCTCGGGGTGCACATGGCCATC 716. TTGCCTCGCAGGGGAAGCAG 1027. DMSO 0 1 3CTCGTGGGAGGCCAACACCT 717. AGCCACCAACACATACCAGGCT 1028. DMSO 2 0 2GCATGCCTTTAATCCCGGCT 718. AGGATTTCAGAGTGATGGGGCT 1029. DMSO 2 1 1CGCCCAGCCACAAAGTGCAT 719. 
GCAAATTTCTGCACCTACTCTAGG 1030. DMSO 1 1 2 CCTAGCTCACAAGAATTGGAGGTAACAGT 720. GCAGTCACCCTTCACTGCCTGT 1031. DMSO 1 1 2AAACTGGGCTGGGCTTCCGG 721. GGGGCTAAGGCATTGTCAGACCC 1032. DMSO 2 0 2GCAGGTAGGCAGTCTGGGGC 722. TCTCCTGCCTCAGCCTCCCA 1033. 1M betaine, 1 2 1TD GCAGGTAGGCAGTCTGGGGC 723. TCTCCTGCCTCAGCCTCCCA 1034. 1M betaine, 1 21 TD GCAGGTAGGCAGTCTGGGGC 724. TCTCCTGCCTCAGCCTCCCA 1035. 1M betaine, 12 1 TD GCTCTGGGGTAGAAGGAGGC 725. GGCCTGTCAACCAACCAACC 1036. DMSO 2 2 0TGACATGTTGTGTGCTGGGC 726. AAATCCTGCAGCCTCCCCTT 1037. DMSO 0 2 2TCCTGGTGAGATCGTCCACAGGA 727. TCCTCCCCACTCAGCCTCCC 1038. DMSO 0 3 1TCCTAATCCAAGTCCTTTGTTCAGACA 728. AGGGACCAGCCACTACCCTTCA 1039. DMSO 2 2 0GGGACACCAGTTCCTTCCAT 729. GGGGGAGATTGGAGTTCCCC 1040. DMSO 1 0 4ACACCACTATCAAGGCAGAGTAGGT 730. TCTGCCTGGGGTGCTTTCCC 1041. DMSO 1 1 3CTGGGAGCGGAGGGAAGTGC 731. GCCCCGACAGATGAGGCCTC 1042. DMSO 1 2 2CAGATTACTGCTGCAGCA CGGGTCTCGGAATGCCTCCA 732. ACCCAGGAATTGCCACCCCC 1043.DMSO 1 2 3 CCGGG (SEQ ID NO: 434) TTGCTGTGGTCCCGGTGGTG 733.GCAGACACTAGAGCCCGCCC 1044. DMSO 3 2 0 GGTGTGGTGACAGGTCGGGT 734.ACCTGCGTCTCTGTGCTGCA 1045. DMSO 2 3 0 CTCCCAGGACAGTGCTCGGC 735.CCTGGCCCCATGCTGCCTG 1046. DMSO 2 2 1 TGCGTAGGTTTTGCCTCTGTGA 736.AGGGAATGATGTTTTCCACCCCCT 1047. DMSO 2 3 0 CTCCGCAGCCACCGTTGGTA 737.TGCATTGACGTACGATGGCTCA 1048. DMSO 1 3 1 ACCTGCAGCATGAACTCTCGCA 738.ACCTGAGCAACATGACTCACCTG 1049. DMSO 2 1 2 G ACACAAACTTCTGCAGCATCTCCAGTTTCTTGCTCTCATGG 739. ACCATTGGTGAACCCAGTCA 1050. 1M betaine, 3/23 1 CCTGG TD ACACAAACTTCTGCAGCA CGTGG (SEQ ID NO: 435)TGGGGTGGTGGTCTTGAATCCA 740. TCAGCTATAACCTGGGACTTGTGC 1051. DMSO 2 1 3 TAGCAGCCAGTCCAGTGTCCTG 741. CCCTTTCATCGAGAACCCCAGGG 1052. DMSO 3 1 2TGGACGCTGCTGGGAGGAGA 742. GAGGTCTCGGGCTGCTCGTG 1053. DMSO 0 3 3AGGTTTGCACTCTGTTGCCTGG 743. TGGGGTGATTGGTTGCCAGGT 1054. DMSO 3 2 1TCTTCCTTTGCCAGGCAGCACA 744. TGCAGGAATAGCAGGTATGAGGA 1055. DMSO 4 0 2 GTGGACGCCTACTGCCTGGACC 745. GCCCTGGCAGCCCATGGTAC 1056. DMSO 3 0 3AGGCAGTCATCGCCTTGCTA 746. GGTCCCACCTTCCCCTACAA 1057. 
DMSO 2 3 1 Not optimized 3 1 2 CCCCAGCCCCCACCAGTTTC 747. CAGCCCAGGCCACAGCTTCA 1058. DMSO 1 4 1

Sequences and characteristics of genomic on- and off-target sites for six RGNs targeted to endogenous human genes and primers and PCR conditions used to amplify these sites.

Determination of RGN-Induced On- and Off-Target Mutation Frequencies in Human Cells

For U2OS.EGFP and K562 cells, 2×10⁵ cells were transfected with 250 ng of gRNA expression plasmid or an empty U6 promoter plasmid (for negative controls), 750 ng of Cas9 expression plasmid, and 30 ng of td-Tomato expression plasmid using the 4D Nucleofector System according to the manufacturer's instructions (Lonza). For HEK293 cells, 1.65×10⁵ cells were transfected with 125 ng of gRNA expression plasmid or an empty U6 promoter plasmid (for the negative control), 375 ng of Cas9 expression plasmid, and 30 ng of a td-Tomato expression plasmid using Lipofectamine LTX reagent according to the manufacturer's instructions (Life Technologies). Genomic DNA was harvested from transfected U2OS.EGFP, HEK293, or K562 cells using the QIAamp DNA Blood Mini Kit (QIAGEN), according to the manufacturer's instructions. To generate enough genomic DNA to amplify the off-target candidate sites, DNA from three Nucleofections (for U2OS.EGFP cells), two Nucleofections (for K562 cells), or two Lipofectamine LTX transfections was pooled together before performing T7EI. This was done twice for each condition tested, thereby generating duplicate pools of genomic DNA representing a total of four or six individual transfections. PCR was then performed using these genomic DNAs as templates as described above, and the products were purified using Ampure XP beads (Agencourt) according to the manufacturer's instructions. T7EI assays were performed as previously described (Reyon et al., 2012, supra).

DNA Sequencing of NHEJ-Mediated Indel Mutations

Purified PCR products used for the T7EI assay were cloned into Zero Blunt TOPO vector (Life Technologies) and plasmid DNAs were isolated using an alkaline lysis miniprep method by the MGH DNA Automation Core. Plasmids were sequenced using an M13 forward primer (5′-GTAAAACGACGGCCAG-3′ (SEQ ID NO:1059)) by the Sanger method (MGH DNA Sequencing Core).

Example 1a. Single Nucleotide Mismatches

To begin to define the specificity determinants of RGNs in human cells, a large-scale test was performed to assess the effects of systematically mismatching various positions within multiple gRNA/target DNA interfaces. To do this, a quantitative human cell-based enhanced green fluorescent protein (EGFP) disruption assay previously described (see Methods above and Reyon et al., 2012, supra) that enables rapid quantitation of targeted nuclease activities (FIG. 2B) was used. In this assay, the activities of nucleases targeted to a single integrated EGFP reporter gene can be quantified by assessing loss of fluorescence signal in human U2OS.EGFP cells caused by inactivating frameshift insertion/deletion (indel) mutations introduced by error-prone non-homologous end-joining (NHEJ) repair of nuclease-induced double-stranded breaks (DSBs) (FIG. 2B). For the studies described here, three ~100 nt single gRNAs targeted to different sequences within EGFP were used, as follows:

EGFP Site 1 (SEQ ID NO: 9) GGGCACGGGCAGCTTGCCGGTGG
EGFP Site 2 (SEQ ID NO: 10) GATGCCGTTCTTCTGCTTGTCGG
EGFP Site 3 (SEQ ID NO: 11) GGTGGTGCAGATGAACTTCAGGG

Each of these gRNAs can efficiently direct Cas9-mediated disruption of EGFP expression (see Examples 1e and 2a, and FIG. 3E (top) and 3F (top)).

In initial experiments, the effects of single nucleotide mismatches at 19 of 20 nucleotides in the complementary targeting region of three EGFP-targeted gRNAs were tested. To do this, variant gRNAs were generated for each of the three target sites harboring Watson-Crick transversion mismatches at positions 1 through 19 (numbered 1 to 20 in the 3′ to 5′ direction; see FIG. 1), and the abilities of these various gRNAs to direct Cas9-mediated EGFP disruption in human cells were tested. (Variant gRNAs bearing a substitution at position 20 were not generated because this nucleotide is part of the U6 promoter sequence and therefore must remain a guanine to avoid affecting expression.)
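The variant design described above can be sketched in a few lines. This is only an illustration, assuming that a "Watson-Crick transversion" means swapping each base for its complement (A with T, G with C) and that positions are counted from the 3′ end of the 20 nt targeting region; the function name `mismatch_variant` is hypothetical and not part of the described methods.

```python
# Assumed substitution rule: each base is replaced by its Watson-Crick partner.
TRANSVERSION = {"A": "T", "T": "A", "G": "C", "C": "G"}

def mismatch_variant(target, positions):
    """Return `target` (written 5' to 3') with substitutions at `positions`.

    Positions are numbered from the 3' end (position 1 = last nucleotide).
    Substituted bases are written in lowercase so they are easy to spot.
    """
    n = len(target)
    bases = list(target.upper())
    for pos in positions:
        if not 1 <= pos <= n:
            raise ValueError(f"position {pos} is outside 1..{n}")
        idx = n - pos  # convert 3'-based position to a 0-based index
        bases[idx] = TRANSVERSION[bases[idx]].lower()
    return "".join(bases)

# 20 nt targeting region of EGFP Site 1 (PAM removed).
EGFP_SITE_1 = "GGGCACGGGCAGCTTGCCGG"

# Single-mismatch variants at positions 1 through 19 (position 20 stays G).
single_variants = {pos: mismatch_variant(EGFP_SITE_1, [pos]) for pos in range(1, 20)}

# A block of adjacent mismatches at positions 19, 18 and 17.
triple_variant = mismatch_variant(EGFP_SITE_1, [19, 18, 17])
```

The same helper accepts any list of positions, so double mismatches (adjacent or separated) can be produced by passing two positions.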

For EGFP target site #2, single mismatches in positions 1-10 of the gRNA have dramatic effects on associated Cas9 activity (FIG. 2C, middle panel), consistent with previous studies that suggest mismatches at the 5′ end of gRNAs are better tolerated than those at the 3′ end (Jiang et al., Nat Biotechnol 31, 233-239 (2013); Cong et al., Science 339, 819-823 (2013); Jinek et al., Science 337, 816-821 (2012)). However, with EGFP target sites #1 and #3, single mismatches at all but a few positions in the gRNA appear to be well tolerated, even within the 3′ end of the sequence. Furthermore, the specific positions that were sensitive to mismatch differed for these two targets (FIG. 2C, compare top and bottom panels); for example, target site #1 was particularly sensitive to a mismatch at position 2 whereas target site #3 was most sensitive to mismatches at positions 1 and 8.

Example 1b. Multiple Mismatches

To test the effects of more than one mismatch at the gRNA/DNA interface, a series of variant gRNAs bearing double Watson-Crick transversion mismatches in adjacent and separated positions were created and the abilities of these gRNAs to direct Cas9 nuclease activity were tested in human cells using the EGFP disruption assay. All three target sites generally showed greater sensitivity to double alterations in which one or both mismatches occur within the 3′ half of the gRNA targeting region. However, the magnitude of these effects exhibited site-specific variation, with target site #2 showing the greatest sensitivity to these double mismatches and target site #1 generally showing the least. To test the number of adjacent mismatches that can be tolerated, variant gRNAs were constructed bearing increasing numbers of mismatched positions, ranging from position 19 to position 15 in the 5′ end of the gRNA targeting region (where single and double mismatches appeared to be better tolerated).

Testing of these increasingly mismatched gRNAs revealed that for all three target sites, the introduction of three or more adjacent mismatches results in significant loss of RGN activity. A sudden drop-off in activity occurred for three different EGFP-targeted gRNAs as progressive mismatches were made starting from position 19 in the 5′ end and adding more mismatches moving toward the 3′ end. Specifically, gRNAs containing mismatches at positions 19 and 19+18 show essentially full activity whereas those with mismatches at positions 19+18+17, 19+18+17+16, and 19+18+17+16+15 show essentially no difference relative to a negative control (FIG. 2F). (Note that we did not mismatch position 20 in these variant gRNAs because this position needs to remain a G, as it is part of the U6 promoter that drives expression of the gRNA.)

Additional evidence that shortening gRNA complementarity might lead to RGNs with greater specificities was obtained in the following experiment: for four different EGFP-targeted gRNAs (FIG. 2H), introduction of a double mismatch at positions 18 and 19 did not significantly impact activity. However, introduction of another double mismatch at positions 10 and 11 into these gRNAs then results in near complete loss of activity. Interestingly, introduction of only the 10/11 double mismatches does not generally have as great an impact on activity.

Taken together, these results in human cells confirm that the activities of RGNs can be more sensitive to mismatches in the 3′ half of the gRNA targeting sequence. However, the data also clearly reveal that the specificity of RGNs is complex and target site-dependent, with single and double mismatches often well tolerated even when one or more mismatches occur in the 3′ half of the gRNA targeting region. Furthermore, these data also suggest that not all mismatches in the 5′ half of the gRNA/DNA interface are necessarily well tolerated.

In addition, these results strongly suggest that gRNAs bearing shorter regions of complementarity (specifically ~17 nts) will be more specific in their activities. We note that 17 nts of specificity combined with the 2 nts of specificity conferred by the PAM sequence results in specification of a 19 bp sequence, one of sufficient length to be unique in large complex genomes such as those found in human cells.
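The uniqueness argument for a 19 bp specified sequence can be checked with rough arithmetic. The sketch below is only a back-of-the-envelope estimate under stated assumptions: a haploid human genome of about 3.1×10⁹ bp, bases treated as independent and equally likely, and both strands searched. Real genomes contain repeats and skewed base composition, so this is an idealized lower bound on ambiguity, not a prediction; `HUMAN_GENOME_BP` and `expected_random_matches` are illustrative names.

```python
# Assumption for illustration: approximate haploid human genome size in bp.
HUMAN_GENOME_BP = 3.1e9

def expected_random_matches(n_specified, genome_bp=HUMAN_GENOME_BP):
    """Expected chance occurrences of one fixed n-mer in a random genome.

    Counts both strands and assumes independent, equiprobable bases.
    """
    return 2 * genome_bp / 4 ** n_specified

# 17 nt of gRNA complementarity plus 2 specified PAM nucleotides = 19 positions.
for n in (17, 19, 22):
    print(n, 4 ** n, expected_random_matches(n))
```

Under these assumptions the number of possible 19-mers (4¹⁹) exceeds the genome size by roughly two orders of magnitude, so a given 19 bp sequence is expected to occur by chance well under once.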

Example 1c. Off-Target Mutations

To determine whether off-target mutations for RGNs targeted to endogenous human genes could be identified, six single gRNAs that target three different sites in the VEGFA gene, one in the EMX1 gene, one in the RNF2 gene, and one in the FANCF gene were used (Table 1 and Table A). These six gRNAs efficiently directed Cas9-mediated indels at their respective endogenous loci in human U2OS.EGFP cells as detected by T7 Endonuclease I (T7EI) assay (Methods above and Table 1). For each of these six RGNs, we then examined dozens of potential off-target sites (ranging in number from 46 to as many as 64) for evidence of nuclease-induced NHEJ-mediated indel mutations in U2OS.EGFP cells. The loci assessed included all genomic sites that differ by one or two nucleotides as well as subsets of genomic sites that differ by three to six nucleotides and with a bias toward those that had one or more of these mismatches in the 5′ half of the gRNA targeting sequence (Table B). Using the T7EI assay, four off-target sites (out of 53 candidate sites examined) for VEGFA site 1, twelve (out of 46 examined) for VEGFA site 2, seven (out of 64 examined) for VEGFA site 3 and one (out of 46 examined) for the EMX1 site (Table 1 and Table B) were readily identified. No off-target mutations were detected among the 43 and 50 potential sites examined for the RNF2 or FANCF genes, respectively (Table B). The rates of mutation at verified off-target sites were very high, ranging from 5.6% to 125% (mean of 40%) of the rate observed at the intended target site (Table 1). These bona fide off-targets included sequences with mismatches in the 3′ end of the target site and with as many as a total of five mismatches, with most off-target sites occurring within protein coding genes (Table 1). DNA sequencing of a subset of off-target sites provided additional molecular confirmation that indel mutations occur at the expected RGN cleavage site (FIGS. 8A-C).

TABLE 1
On- and off-target mutations induced by RGNs designed to endogenous human genes

Indel mutation frequencies are given as % ± SEM.

Site name  Sequence                 SEQ ID NO  U2OS.EGFP    HEK293       K562         Gene

Target 1 (VEGFA Site 1)
T1         GGGTGGGGGGAGTTTGCTCCTGG  1059       26.0 ± 2.9   10.5 ± 0.07  3.33 ± 0.42  VEGFA
OT1-3      GGATGGAGGGAGTTTGCTCCTGG  1060       25.7 ± 9.1   18.9 ± 0.77  2.93 ± 0.04  IGDCC3
OT1-4      GGGAGGGTGGAGTTTGCTCCTGG  1061        9.2 ± 0.8   8.32 ± 0.51  N.D.         LOC116437
OT1-6      CGGGGGAGGGAGTTTGCTCCTGG  1062        5.3 ± 0.2   3.67 ± 0.09  N.D.         CACNA2D
OT1-11     GGGGAGGGGAAGTTTGCTCCTGG  1063       17.1 ± 4.7   8.54 ± 0.16  N.D.

Target 2 (VEGFA Site 2)
T2         GACCCCCTCCACCCCGCCTCCGG  1064       50.2 ± 4.9   38.6 ± 1.92  15.0 ± 0.25  VEGFA
OT2-1      GACCCCCCCCACCCCGCCCCCGG  1065       14.4 ± 3.4   33.6 ± 1.17  4.10 ± 0.05  FMN1
OT2-2      GGGCCCCTCCACCCCGCCTCTGG  1066       20.0 ± 6.2   15.6 ± 0.30  3.00 ± 0.06  PAX6
OT2-6      CTACCCCTCCACCCCGCCTCCGG  1067        8.2 ± 1.4   15.0 ± 0.64  5.24 ± 0.22  PAPD7
OT2-9      GCCCCCACCCACCCCGCCTCTGG  1068       50.7 ± 5.6   30.7 ± 1.44  7.05 ± 0.48  LAMA3
OT2-15     TACCCCCCACACCCCGCCTCTGG  1069        9.7 ± 4.5   6.97 ± 0.10  1.34 ± 0.15  SPNS3
OT2-17     ACACCCCCCCACCCCGCCTCAGG  1070       14.0 ± 2.8   12.3 ± 0.45  1.80 ± 0.03
OT2-19     ATTCCCCCCCACCCCGCCTCAGG  1071       17.0 ± 3.3   19.4 ± 1.35  N.D.         HDLBP
OT2-20     CCCCACCCCCACCCCGCCTCAGG  1072        6.1 ± 1.3   N.D.         N.D.         ABLIM1
OT2-23     CGCCCTCCCCACCCCGCCTCCGG  1073       44.4 ± 6.7   28.7 ± 1.15  4.18 ± 0.37  CALY
OT2-24     CTCCCCACCCACCCCGCCTCAGG  1074       62.8 ± 5.0   29.8 ± 1.08  21.1 ± 1.68
OT2-29     TGCCCCTCCCACCCCGCCTCTGG  1075       13.8 ± 5.2   N.D.         N.D.         ACLY
OT2-34     AGGCCCCCACACCCCGCCTCAGG  1076        2.8 ± 1.5   N.D.         N.D.

Target 3 (VEGFA Site 3)
T3         GGTGAGTGAGTGTGTGCGTGTGG  1077       49.4 ± 3.8   35.7 ± 1.26  27.9 ± 0.52  VEGFA
OT3-1      GGTGAGTGAGTGTGTGTGTGAGG  1078        7.4 ± 3.4   8.97 ± 0.80  N.D.         (abParts)
OT3-2      AGTGAGTGAGTGTGTGTGTGGGG  1079       24.3 ± 9.2   23.9 ± 0.08   8.9 ± 0.16  MAX
OT3-4      GCTGAGTGAGTGTATGCGTGTGG  1080       20.9 ± 11.8  11.2 ± 0.23  N.D.
OT3-9      GGTGAGTGAGTGCGTGCGGGTGG  1081        3.2 ± 0.3   2.34 ± 0.21  N.D.         TPCN2
OT3-17     GTTGAGTGAATGTGTGCGTGAGG  1082        2.9 ± 0.2   1.27 ± 0.02  N.D.         SLIT1
OT3-18     TGTGGGTGAGTGTGTGCGTGAGG  1083       13.4 ± 4.2   12.1 ± 0.24  2.42 ± 0.07  COMDA
OT3-20     AGAGAGTGAGTGTGTGCATGAGG  1084       16.7 ± 3.5   7.64 ± 0.05  1.18 ± 0.01

Target 4 (EMX1)
T4         GAGTCCGAGCAGAAGAAGAAGGG  1085       42.1 ± 0.4   26.0 ± 0.70  10.7 ± 0.50  EMX1
OT4-1      GAGTTAGAGCAGAAGAAGAAAGG  1086       16.8 ± 0.2   8.43 ± 1.32  2.54 ± 0.02  HCN1

Target 5 (RNF2)
T5         GTCATCTTAGTCATTACCTGTGG  1087       26.6 ± 6.0   N.T.         N.T.         RNF2

Target 6 (FANCF)
T6         GGAATCCCTTCTGCAGCACCAGG  1088       33.2 ± 6.5   N.T.         N.T.         FANCF

"OT" indicates off-target sites (with numbering of sites as in Table E). Mismatches from the on-target (within the 20 bp region to which the gRNA hybridizes) are highlighted as bold, underlined text. Mean indel mutation frequencies in U2OS.EGFP, HEK293, and K562 cells were determined as described in Methods. Genes in which sites were located (if any) are shown. All sites listed failed to show any evidence of modification in cells transfected with Cas9 expression plasmid and a control U6 promoter plasmid that did not express a functional gRNA. N.D. = none detected; N.T. = not tested.
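The relative off-target rates quoted in Example 1c (5.6% to 125%, mean of 40%, of the on-target rate) can be re-derived from the U2OS.EGFP column of Table 1. The sketch below is a consistency check only: the values are transcribed by hand from Table 1, SEM values are ignored, and the names `ON_TARGET`, `OFF_TARGET` and `relative_rates` are illustrative.

```python
from statistics import mean

# Mean U2OS.EGFP indel frequencies (%) transcribed from Table 1.
ON_TARGET = {"T1": 26.0, "T2": 50.2, "T3": 49.4, "T4": 42.1}
OFF_TARGET = {
    "T1": {"OT1-3": 25.7, "OT1-4": 9.2, "OT1-6": 5.3, "OT1-11": 17.1},
    "T2": {"OT2-1": 14.4, "OT2-2": 20.0, "OT2-6": 8.2, "OT2-9": 50.7,
           "OT2-15": 9.7, "OT2-17": 14.0, "OT2-19": 17.0, "OT2-20": 6.1,
           "OT2-23": 44.4, "OT2-24": 62.8, "OT2-29": 13.8, "OT2-34": 2.8},
    "T3": {"OT3-1": 7.4, "OT3-2": 24.3, "OT3-4": 20.9, "OT3-9": 3.2,
           "OT3-17": 2.9, "OT3-18": 13.4, "OT3-20": 16.7},
    "T4": {"OT4-1": 16.8},
}

def relative_rates(on_target, off_target):
    """Express each off-target frequency as a percentage of its on-target frequency."""
    return {
        site: 100.0 * freq / on_target[target]
        for target, sites in off_target.items()
        for site, freq in sites.items()
    }

rates = relative_rates(ON_TARGET, OFF_TARGET)
print(len(rates), min(rates.values()), max(rates.values()), mean(rates.values()))
```

With these inputs the lowest ratio comes from OT2-34 and the highest from OT2-24, both relative to VEGFA site 2.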

Example 1d. Off-Target Mutations in Other Cell Types

Having established that RGNs can induce off-target mutations with high frequencies in U2OS.EGFP cells, we next sought to determine whether these nucleases would also have these effects in other types of human cells. We had chosen U2OS.EGFP cells for our initial experiments because we previously used these cells to evaluate the activities of TALENs, but human HEK293 and K562 cells have been more widely used to test the activities of targeted nucleases. Therefore, we also assessed the activities of the four RGNs targeted to VEGFA sites 1, 2, and 3 and the EMX1 site in HEK293 and K562 cells. We found that each of these four RGNs efficiently induced NHEJ-mediated indel mutations at their intended on-target site in these two additional human cell lines (as assessed by T7EI assay) (Table 1), albeit with somewhat lower mutation frequencies than those observed in U2OS.EGFP cells. Assessment of the 24 off-target sites for these four RGNs originally identified in U2OS.EGFP cells revealed that many were again mutated in HEK293 and K562 cells with frequencies similar to those at their corresponding on-target site (Table 1). As expected, DNA sequencing of a subset of these off-target sites from HEK293 cells provided additional molecular evidence that alterations are occurring at the expected genomic loci (FIGS. 9A-C). We do not know for certain why four of the off-target sites identified in U2OS.EGFP cells did not show detectable mutations in HEK293 cells, and eleven did not in K562 cells. However, we note that many of these off-target sites also showed relatively lower mutation frequencies in U2OS.EGFP cells. Therefore, we speculate that mutation rates of these sites in HEK293 and K562 cells may be falling below the reliable detection limit of our T7EI assay (~2-5%) because RGNs generally appear to have lower activities in HEK293 and K562 cells compared with U2OS.EGFP cells in our experiments. Taken together, our results in HEK293 and K562 cells provide evidence that the high-frequency off-target mutations we observe with RGNs will be a general phenomenon seen in multiple human cell types.

Example 1e. Titration of gRNA- and Cas9-Expressing Plasmid Amounts Used for the EGFP Disruption Assay

Single gRNAs were generated for three different sequences (EGFP Sites 1-3, shown above) located upstream of EGFP nucleotide 502, a position at which the introduction of frameshift mutations via non-homologous end-joining can robustly disrupt expression of EGFP (Maeder, M. L. et al., Mol Cell 31, 294-301 (2008); Reyon, D. et al., Nat Biotech 30, 460-465 (2012)).

For each of the three target sites, a range of gRNA-expressing plasmid amounts (12.5 to 250 ng) was initially transfected together with 750 ng of a plasmid expressing a codon-optimized version of the Cas9 nuclease into our U2OS.EGFP reporter cells bearing a single copy, constitutively expressed EGFP-PEST reporter gene. All three RGNs efficiently disrupted EGFP expression at the highest concentration of gRNA-encoding plasmid (250 ng) (FIG. 3E (top)). However, RGNs for target sites #1 and #3 exhibited equivalent levels of disruption when lower amounts of gRNA-expressing plasmid were transfected whereas RGN activity at target site #2 dropped immediately when the amount of gRNA-expressing plasmid transfected was decreased (FIG. 3E (top)).

The amount of Cas9-encoding plasmid (range from 50 ng to 750 ng) transfected into our U2OS.EGFP reporter cells was titrated and EGFP disruption assayed. As shown in FIG. 3F (top), target site #1 tolerated a three-fold decrease in the amount of Cas9-encoding plasmid transfected without substantial loss of EGFP disruption activity. However, the activities of RGNs targeting target sites #2 and #3 decreased immediately with a three-fold reduction in the amount of Cas9 plasmid transfected (FIG. 3F (top)). Based on these results, 25 ng/250 ng, 250 ng/750 ng, and 200 ng/750 ng of gRNA-/Cas9-expressing plasmids were used for EGFP target sites #1, #2, and #3, respectively, for the experiments described in Examples 1a-1d.

The reasons why some gRNA/Cas9 combinations work better than others in disrupting EGFP expression are not understood, nor is it understood why some of these combinations are more or less sensitive to the amount of plasmids used for transfection. Although it is possible that the range of off-target sites present in the genome for these three gRNAs might influence each of their activities, no differences were seen in the numbers of genomic sites that differ by one to six bps for each of these particular target sites (Table C) that would account for the differential behavior of the three gRNAs.

TABLE C
Numbers of off-target sites in the human genome for six RGNs targeted to endogenous human genes and three RGNs targeted to the EGFP reporter gene

                           Number of mismatches to on-target site
Target Site                0    1    2    3     4      5      6
Target 1 (VEGFA Site 1)    1    1    4    32    280    2175   13873
Target 2 (VEGFA Site 2)    1    0    2    35    443    3889   17398
Target 3 (VEGFA Site 3)    1    1    17   377   6028   13398  35517
Target 4 (EMX)             1    0    1    18    276    2309   15731
Target 5 (RNF2)            1    0    0    6     116    976    7443
Target 6 (FANCF)           1    0    1    18    271    1467   9551
EGFP Target Site #1        0    0    3    10    156    1365   9755
EGFP Target Site #2        0    0    0    11    96     974    7353
EGFP Target Site #3        0    0    1    14    165    1439   10361

Off-target sites for each of the six RGNs targeted to the VEGFA, RNF2, FANCF, and EMX1 genes and the three RGNs targeted to EGFP Target Sites #1, #2 and #3 were identified in human genome sequence build GRCh37. Mismatches were only allowed for the 20 nt region to which the gRNA anneals and not to the PAM sequence.
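The counting rule stated for Table C (mismatches scored only within the 20 nt region to which the gRNA anneals, not in the PAM) can be illustrated with a small helper. This is a sketch under the assumption that sites are written 5′ to 3′ with the 3 nt PAM at the end, as in Table 1; it does not reproduce the genome-wide search itself, and the function names are illustrative.

```python
def _spacers(on_target, candidate, spacer_len):
    """Return the first `spacer_len` bases of both sites, uppercased."""
    if len(on_target) < spacer_len or len(candidate) < spacer_len:
        raise ValueError("both sites must contain at least spacer_len bases")
    return on_target[:spacer_len].upper(), candidate[:spacer_len].upper()

def count_mismatches(on_target, candidate, spacer_len=20):
    """Count mismatches within the gRNA-annealing region, ignoring the PAM."""
    a, b = _spacers(on_target, candidate, spacer_len)
    return sum(x != y for x, y in zip(a, b))

def mismatch_positions(on_target, candidate, spacer_len=20):
    """Mismatched positions numbered from the 3' end (PAM-proximal base = 1)."""
    a, b = _spacers(on_target, candidate, spacer_len)
    return [spacer_len - i for i, (x, y) in enumerate(zip(a, b)) if x != y]

# EMX1 on-target site (T4) and its off-target site OT4-1, from Table 1.
T4 = "GAGTCCGAGCAGAAGAAGAAGGG"
OT4_1 = "GAGTTAGAGCAGAAGAAGAAAGG"

print(count_mismatches(T4, OT4_1), mismatch_positions(T4, OT4_1))
```

Note that the two EMX1 sites also differ in their PAM (GGG versus AGG), which is deliberately not counted here.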

Example 2: Shortening gRNA Complementarity Length to Improve RGN Cleavage Specificity

It was hypothesized that off-target effects of RGNs might be minimized without compromising on-target activity simply by decreasing the length of the gRNA-DNA interface, an approach that at first might seem counterintuitive. Longer gRNAs can actually function less efficiently at the on-target site (see below and Hwang et al., 2013a; Ran et al., 2013). In contrast, as shown above in Example 1, gRNAs bearing multiple mismatches at their 5′ ends could still induce robust cleavage of their target sites (FIGS. 2A and 2C-2F), suggesting that these nucleotides might not be required for full on-target activity. Therefore, it was hypothesized that truncated gRNAs lacking these 5′ nucleotides might show activities comparable to full-length gRNAs (FIG. 2A). It was speculated that if the 5′ nucleotides of full-length gRNAs are not needed for on-target activity then their presence might also compensate for mismatches at other positions along the gRNA-target DNA interface. If this were true, it was hypothesized that truncated gRNAs might have greater sensitivity to mismatches and thus might also induce substantially lower levels of Cas9-mediated off-target mutations (FIG. 2A).
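As a minimal sketch of the truncation idea, the helper below removes nucleotides from the 5′ end of a 20 nt targeting sequence while keeping the PAM-proximal 3′ end intact. The restriction to 17 or 18 nt lengths that begin with G is an assumption made here for illustration, motivated by the earlier note that the 5′-most nucleotide must remain a G for U6-driven expression; `candidate_truncations` is a hypothetical name.

```python
def truncate_spacer(spacer, length):
    """Keep the 3'-most `length` nucleotides of a targeting sequence."""
    if not 1 <= length <= len(spacer):
        raise ValueError("length must be between 1 and len(spacer)")
    return spacer[-length:]

def candidate_truncations(spacer, lengths=(17, 18), required_5prime="G"):
    """List truncated targeting sequences that start with the required base.

    The 5' G requirement is an illustrative assumption, not a stated rule.
    """
    candidates = []
    for length in lengths:
        truncated = truncate_spacer(spacer.upper(), length)
        if truncated.startswith(required_5prime):
            candidates.append(truncated)
    return candidates

# 20 nt targeting regions of VEGFA sites 1 and 3 (Table 1, PAM removed).
VEGFA_SITE_1 = "GGGTGGGGGGAGTTTGCTCC"
VEGFA_SITE_3 = "GGTGAGTGAGTGTGTGCGTG"

print(candidate_truncations(VEGFA_SITE_1))
print(candidate_truncations(VEGFA_SITE_3))
```

For these two sites the helper returns the 18 nt and 17 nt sequences, respectively, that appear as tru-gRNA targeting regions in Table D.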

Experimental Procedures

The following experimental procedures were used in Example 2.

Plasmid Construction

All gRNA expression plasmids were assembled by designing, synthesizing, annealing, and cloning pairs of oligonucleotides (IDT) harboring the complementarity region into plasmid pMLM3636 (available from Addgene) as described above (Example 1). The resulting gRNA expression vectors encode a ~100 nt gRNA whose expression is driven by a human U6 promoter. The sequences of all oligonucleotides used to construct gRNA expression vectors are shown in Table D. The Cas9 D10A nickase expression plasmid (pJDS271) bearing a mutation in the RuvC endonuclease domain was generated by mutating plasmid pJDS246 using a QuikChange kit (Agilent Technologies) with the following primers: Cas9 D10A sense primer 5′-tggataaaaagtattctattggtttagccatcggcactaattccg-3′ (SEQ ID NO:1089); Cas9 D10A antisense primer 5′-cggaattagtgccgatggctaaaccaatagaatactttttatcca-3′ (SEQ ID NO:1090). All the targeted gRNA plasmids and the Cas9 nickase plasmids used in this study are available through the non-profit plasmid distribution service Addgene (addgene.org/crispr-cas).
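Site-directed mutagenesis primer pairs of this kind are typically exact reverse complements of each other, which gives a simple way to sanity-check a transcribed primer. The sketch below assumes that relationship holds for the D10A primers and computes the expected antisense sequence from the sense primer; `reverse_complement` is an illustrative helper, not part of any kit or library.

```python
# Complement table for lowercase and uppercase DNA bases.
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, preserving case."""
    return seq.translate(_COMPLEMENT)[::-1]

# Cas9 D10A sense primer as given in the text (SEQ ID NO:1089).
D10A_SENSE = "tggataaaaagtattctattggtttagccatcggcactaattccg"

# Expected antisense primer if the pair is fully complementary (assumption).
expected_antisense = reverse_complement(D10A_SENSE)
print(expected_antisense)
```

Comparing `expected_antisense` character by character with a transcribed antisense primer flags any position where the two disagree.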

TABLE DSequences of oligonucleotides used to construct gRNA expression plasmidsEGFP Target Site 1 SEQ oligo- ID oligo- 20 19 18 17 16 15 14 13 12 11 109 8 7 6 5 4 3 2 1 nucleotide 1 (5′ to 3′) NO: nucleotide 2 (5′ to 3′)SEQ ID NO: G C A C G G G C A G C T T G C C G G ACACCGCACGGGCAGCTTGCCGGG1091 AAAACGGGGCAAGCTGCCCGTGCG 1180. G C A C G G G C A G C T T G C C G cACACCGCACGGGCAGCTTGCCGCG 1092 AAAACGCGGCAAGCTGCCCGTGCG 1181. G C A C G GG C A G C T T G C C c G ACACCGCACGGGCAGCTTGCCCGG 1093AAAACCGGGCAAGCTGCCCGTGCG 1182. G C A C G G G C A G C T T G C g G GACACCGCACGGGCAGCTTGCGGGG 1094 AAAACCCCGCAAGCTGCCCGTGCG 1183. G C A C G GG C A G C T T G g C G G ACACCGCACGGGCAGCTTGGCGGG 1095AAAACCCGCCAAGCTGCCCGTGCG 1184. G C A C G G G C A G C T T c C C G GACACCGCACGGGCAGCTTCCCGGG 1096 AAAACCCGGGAAGCTGCCCGTGCG 1185. G C A C G GG C A G C T a G C C G G ACACCGCACGGGCAGCTAGCCGGG 1097AAAACCCGGCTAGCTGCCCGTGCG 1186. G C A C G G G C A G C c T G C C G GACACCGCACGGGCAGCATGCCGGG 1098 AAAACCCGGCATGCTGCCCGTGCG 1187. G C A C G GG C A G g T T G C C G G ACACCGCACGGGCAGGTTGCCGGG 1099AAAACCCGGCAACCTGCCCGTGCG 1188. G C A C G G G C A c C T T G C C G GACACCGCACGGGCACCTTGCCGGG 1100 AAAACCCGGCAAGGTGCCCGTGCG 1189. G C A C G GG C t G C T T G C C G G ACACCGCACGGGCTGCTTGCCGGG 1101AAAACCCGGCAAGCAGCCCGTGCG 1190. G C A C G G G g A G C T T G C C G GACACCGCACGGGGAGCTTGCCGGG 1102 AAAACCCGGCAAGCTCCCCGTGCG 1191. G C A C G Gc C A G C T T G C C G G ACACCGCACGGCCAGCTTGCCGGG 1103AAAACCCGGCAAGCTGGCCGTGCG 1192. G C A C G c G C A G C T T G C C G GACACCGCACGCGCAGCTTGCCGGG 1104 AAAACCCGGCAAGCTGCGCGTGCG 1193. G C A C c GG C A G C T T G C C G G ACACCGCACCGGCAGCTTGCCGGG 1105AAAACCCGGCAAGCTGCCGGTGCG 1194. G C A g G G G C A G C T T G C C G GACACCGCAGGGGCAGCTTGCCGGG 1106 AAAACCCGGCAAGCTGCCCCTGCG 1195. G C t C G GG C A G C T T G C C G G ACACCGCTCGGGCAGCTTGCCGGG 1107AAAACCCGGCAAGCTGCCCGAGCG 1196. G g A C G G G C A G C T T G C C G GACACCGGACGGGCAGCTTGCCGGG 1108 AAAACCCGGCAAGCTGCCCGTCCG 1197. 
G C A C G GG C A G C T T G C C c c ACACCGCACGGGCAGCTTGCCCCG 1109AAAACGGGGCAAGCTGCCCGTGCG 1198. G C A C G G G C A G C T T G g g G GACACCGCACGGGCAGCTTGGGGGG 1110 AAAACCCCCCAAGCTGCCCGTGCG 1199. G C A C G GG C A G C T a c C C G G ACACCGCACGGGCAGCTACCCGGG 1111AAAACCCGGGTAGCTGCCCGTGCG 1200. G C A C G G G C A G g a T G C C G GACACCGCACGGGCAGGATGCCGGG 1112 AAAACCCGGCATCCTGCCCGTGCG 1201. G C A C G GG C t c C T T G C C G G ACACCGCACGGGCTCCTTGCCGGG 1113AAAACCCGGCAAGGAGCCCGTGCG 1202. G C A C G G c g A G C T T G C C G GACACCGCACGGCGAGCTTGCCGGG 1114 AAAACCCGGCAAGCTCGCCGTGCG 1203. G C A C c cG C A G C T T G C C G G ACACCGCACCCGCAGCTTGCCGGG 1115AAAACCCGGCAAGCTGCGGGTGCG 1204. G C A g G G G C A G C T T G C C G GACACCGCTGGGGCAGCTTGCCGGG 1116 AAAACCCGGCAAGCTGCCCCAGCG 1205. g t A C G GG C A G C T T G C C G G ACACCGGTCGGGCAGCTTGCCGGG 1117AAAACCCGGCAAGCTGCCCGACCG 1206. EGFP Target Site 2 SEQ oligo- ID oligo-20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1nucleotide 1 (5′ to 3′) NO: nucleotide 2 (5′ to 3′) G C C G T T C T T CT G C T T G T ACACCGCCGTTCTTCTGCTTGTG 1118 AAAACACAAGCAGAAGAACGGCG 1207.G C C G T T C T T C T G C T T G A ACACCGCCGTTCTTCTGCTTGAG 1119AAAACTCAAGCAGAAGAACGGCG 1208. G C C G T T C T T C T G C T T c TACACCGCCGTTCTTCTGCTTCTG 1120 AAAACAGAAGCAGAAGAACGGCG 1209. G C C G T T CT T C T G C T a G T ACACCGCCGTTCTTCTGCTAGTG 1121 AAAACACTAGCAGAAGAACGGCG1210. G C C G T T C T T C T G C a T G T ACACCGCCGTTCTTCTGCATGTG 1122AAAACACATGCAGAAGAACGGCG 1211. G C C G T T C T T C T G g T T G TACACCGCCGTTCTTCTGGTTGTG 1123 AAAACACAACCAGAAGAACGGCG 1212. G C C G T T CT T C T c C T T G T ACACCGCCGTTCTTCTCCTTGTG 1124 AAAACACAAGGAGAAGAACGGCG1213. G C C G T T C T T C a G C T T G T ACACCGCCGTTCTTCAGCTTGTG 1125AAAACACAAGCTGAAGAACGGCG 1214. G C C G T T C T T g T G C T T G TACACCGCCGTTCTTGTGCTTGTG 1126 AAAACACAAGCACAAGAACGGCG 1215. G C C G T T CT a C T G C T T G T ACACCGCCGTTCTACTGCTTGTG 1127 AAAACACAAGCAGTAGAACGGCG1216. 
G C C G T T C a T C T G C T T G T ACACCGCCGTTCATCTGCTTGTG 1128AAAACACAAGCAGATGAACGGCG 1217. G C C G T T g T T C T G C T T G TACACCGCCGTTGTTCTGCTTGTG 1129 AAAACACAAGCAGAACAACGGCG 1218. G C C G T a CT T C T G C T T G T ACACCGCCGTACTTCTGCTTGTG 1130 AAAACACAAGCAGAAGTACGGCG1219. G C C G a T C T T C T G C T T G T ACACCGCCGATCTTCTGCTTGTG 1131AAAACACAAGCAGAAGATCGGCG 1220. G C C c T T C T T C T G C T T G TACACCGCCCTTCTTCTGCTTGTG 1132 AAAACACAAGCAGAAGAAGGGCG 1221. G C g G T T CT T C T G C T T G T ACACCGCGGTTCTTCTGCTTGTG 1133 AAAACACAAGCAGAAGAACCGCG1222. G g C G T T C T T C T G C T T G T ACACCGGCGTTCTTCTGCTTGTG 1134AAAACACAAGCAGAAGAACGCCG 1223. G C C G T T C T T C T G C T T c aACACCGCCGTTCTTCTGCTTCAG 1135 AAAACTGAAGCAGAAGAACGGCG 1224. G C C G T T CT T C T G C a a G T ACACCGCCGTTCTTCTGCAAGTG 1136 AAAACACTTGCAGAAGAACGGCG1225. G C C G T T C T T C T c g T T G T ACACCGCCGTTCTTCTCGTTGTG 1137AAAACACAACGAGAAGAACGGCG 1226. G C C G T T C T T g a G C T T G TACACCGCCGTTCTTGAGCTTGTG 1138 AAAACACAAGCTCAAGAACGGCG 1227. G C C G T T Ca a C T G C T T G T ACACCGCCGTTCAACTGCTTGTG 1139 AAAACACAAGCAGTTGAACGGCG1228. G C C G T a g T T C T G C T T G T ACACCGCCGTAGTTCTGCTTGTG 1140AAAACACAAGCAGAACTACGGCG 1229. G C C c a T C T T C T G C T T G TACACCGCCCATCTTCTGCTTGTG 1141 AAAACACAAGCAGAAGATGGGCG 1230. G g g G T T CT T C T G C T T G T ACACCGGGGTTCTTCTGCTTGTG 1142 AAAACACAAGCAGAAGAACCCCG1231. EGFP Target Site 3 SEQ oligo- ID oligo- 20 19 18 17 16 15 14 13 1211 10 9 8 7 6 5 4 3 2 1 nucleotide 1 (5′ to 3′) NO:nucleotide 2 (5′ to 3′) G G T G C A G A T G A A C T T C AACACCGGTGCAGATGAACTTCAG 1143 AAAACTCTAGTTCATCTGCACCG 1232. G G T G C A GA T G A A C T T C t ACACCGGTGCAGATGAACTTCTG 1144 AAAACTCAAGTTCATCTGCACCG1233. G G T G C A G A T G A A C T T g A ACACCGGTGCAGATGAACTTGAG 1145AAAACTGTAGTTCATCTGCACCG 1234. G G T G C A G A T G A A C T a C AACACCGGTGCAGATGAACTACAG 1146 AAAACTGATGTTCATCTGCACCG 1235. G G T G C A GA T G A A C a T C A ACACCGGTGCAGATGAACATCAG 1147 AAAACTGAACTTCATCTGCACCG1236. 
G G T G C A G A T G A A g T T C A ACACCGGTGCAGATGAAGTTCAG 1148AAAACTGAAGATCATCTGCACCG 1237. G G T G C A G A T G A t C T T C AACACCGGTGCAGATGATCTTCAG 1149 AAAACTGAAGTACATCTGCACCG 1238. G G T G C A GA T G t A C T T C A ACACCGGTGCAGATGTACTTCAG 1150 AAAACTGAAGTTGATCTGCACCG1239. G G T G C A G A T c A A C T T C A ACACCGGTGCAGATCAACTTCAG 1151AAAACTGAAGTTCTTCTGCACCG 1240. G G T G C A G A a G A A C T T C AACACCGGTGCAGAAGAACTTCAG 1152 AAAACTGAAGTTCAACTGCACCG 1241. G G T G C A Gt T G A A C T T C A ACACCGGTGCAGTTGAACTTCAG 1153 AAAACTGAAGTTCATGTGCACCG1242. G G T G C A c A T G A A C T T C A ACACCGGTGCACATGAACTTCAG 1154AAAACTGAAGTTCATCAGCACCG 1243. G G T G C t G A T G A A C T T C AACACCGGTGCTGATGAACTTCAG 1155 AAAACTGAAGTTCATCTCCACCG 1244. G G T G g A GA T G A A C T T C A ACACCGGTGGAGATGAACTTCAG 1156 AAAACTGAAGTTCATCTGGACCG1245.   G G T c C A G A T G A A C T T C A ACACCGGTCCAGATGAACTTCAG 1157AAAACTGAAGTTCATCTGCTCCG 1246. G G a G C A G A T G A A C T T C AACACCGGAGCAGATGAACTTCAG 1158 AAAACTGAAGTTCATCTGCAGCG 1247. G c T G C A GA T G A A C T T C A ACACCGCTGCAGATGAACTTCAG 1159 AAAACTGAAGTTCATCTGCAGCG1248. G G T G C A G A T G A A C T T g t ACACCGGTGCAGATGAACTTGTG 1160AAAACACAAGTTCATCTGCACCG 1249. G G T G C A G A T G A A C a a C AACACCGGTGCAGATGAACAACAG 1161 AAAACTGTTGTTCATCTGCACCG 1250. G G T G C A GA T G A g g T T C A ACACCGGTGCAGATGATGTTCAG 1162 AAAACTGAACATCATCTGCACCG1251. G G T G C A G A T c t A C T T C A ACACCGGTGCAGATCTACTTCAG 1163AAAACTGAAGTAGATCTGCACCG 1252. G G T G C A G t a G A A C T T C AACACCGGTGCAGTAGAACTTCAG 1164 AAAACTGAAGTTCTACTGCACCG 1253. G G T G C t cA T G A A C T T C A ACACCGGTGCTCATGAACTTCAG 1165 AAAACTGAAGTTCATGAGCACCG1254. G G T c g A G A T G A A C T T C A ACACCGGTCGAGATGAACTTCAG 1166AAAACTGAAGTTCATCTCGACCG 1255. 
G c a G C A G A T G A A C T T C AACACCGCAGCAGATGAACTTCAG 1167 AAAACTGAAGTTCATCTGCTGCG 1256.Endogenous Target 1 (VEGFA Site 1 tru-gRNA): SEQ oligo- ID oligo- 20 1918 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 nucleotide 1 (5′ to 3′) NO:nucleotide 2 (5′ to 3′) G T G G G G G G A G T T T G C T C CACACCGTGGGGGGACTTTGCTCCG 1168 AAAACGGAGCAAACTCCCCCCACG 1257.Endogenous Target 3 (VEGFA Site 3 tru-gRNA): SEQ oligo- ID oligo- 20 1918 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 nucleotide 1 (5′ to 3′) NO:nucleotide 2 (5′ to 3′) G A G T G A G T G T G T G C G T GACACCGAGTGAGTGTGTGCGTGG 1169 AAAACCACGCACACACTCACTCG 1258.Endogenous Target 4 (EMX1 Site 1 tru-gRNA): SEQ oligo- ID oligo- 20 1918 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 nucleotide 1 (5′ to 3′) NO:nucleotide 2 (5′ to 3′) G T C C G A G C A G A A G A A G A AACACCGTCCGAGCAGAAGAAGAAG 1170 AAAACTTCTTCTTCTGCTCGGACG 1259.CTLA full-length gRNA SEQ oligo- ID oligo- 20 19 18 17 16 15 14 13 12 1110 9 8 7 6 5 4 3 2 1 nucleotide 1 (5′ to 3′) NO: nucleotide 2 (5′ to 3′)G C A G A T G T A G T G T T T C C A C A ACACCGCAGATGTAGTGTTTCCACAG 1171AAAACTGTGGAAACACTACATCTGCG 1260. 
CTLA tru-gRNA
Target sequence (position columns 20 to 1): G A T G T A G T G T T T C C A C A
oligonucleotide 1 (5′ to 3′): ACACCGATGTAGTGTTTCCACAG (SEQ ID NO: 1172)
oligonucleotide 2 (5′ to 3′): AAAACTGTGGAAACACTACATCG (SEQ ID NO: 1261)

VEGFA site 4 full-length gRNA
Target sequence (position columns 20 to 1): T C C C T C T T T A G C C A G A G C C G
oligonucleotide 1 (5′ to 3′): ACACCTCCCTCTTTAGCCAGAGCCGG (SEQ ID NO: 1173)
oligonucleotide 2 (5′ to 3′): AAAACCGGCTCTGGCCTAAAGGGAG (SEQ ID NO: 1262)

EMX1 site 2 full-length gRNA
Target sequence (position columns 20 to 1): G C C G T T T G T A C T T T G T C C T C
oligonucleotide 1 (5′ to 3′): ACACCGCCGTTTGTACTTTGCCTCG (SEQ ID NO: 1174)
oligonucleotide 2 (5′ to 3′): AAAACGAGGACAAAGTACAAACGGCG (SEQ ID NO: 1263)

EMX1 site 2 tru-gRNA
Target sequence (position columns 20 to 1): G T T T G T A C T T T G T C C T C
oligonucleotide 1 (5′ to 3′): ACACCGTTTGTACTTTGTCCTCG (SEQ ID NO: 1175)
oligonucleotide 2 (5′ to 3′): AAAACGAGGACAAAGTACAAACG (SEQ ID NO: 1264)

EMX1 site 3 full-length gRNA
Target sequence (position columns 20 to 1): G G G A A G A C T G A G G C T A C A T A
oligonucleotide 1 (5′ to 3′): ACACCGGGAAGACTGAGGCTACATAG (SEQ ID NO: 1176)
oligonucleotide 2 (5′ to 3′): AAAACTATGTAGCCTCAGTCTTCCCG (SEQ ID NO: 1265)

EMX1 site 3 tru-gRNA
Target sequence (position columns 20 to 1): G A A G A C T G A G G C T A C A T A
oligonucleotide 1 (5′ to 3′): ACACCGAAGACTGAGGCTACATAG (SEQ ID NO: 1177)
oligonucleotide 2 (5′ to 3′): AAAACTATGTAGCCTCAGTCTTCG (SEQ ID NO: 1266)

EMX1 site 4 full-length gRNA
Target sequence (position columns 20 to 1): G A G G C C C C C A G A G C A G C C A C
oligonucleotide 1 (5′ to 3′): ACACCGAGGCCCCCAGAGCAGCCACG (SEQ ID NO: 1178)
oligonucleotide 2 (5′ to 3′): AAAACGTGGCTGCTCTGGGGGCCCTCG (SEQ ID NO: 1267)

EMX1 site 4 tru-gRNA
Target sequence (position columns 20 to 1): G C C C C C A G A G C A G C C A C
oligonucleotide 1 (5′ to 3′): ACACCGCCCCCAGAGCAGCCACG (SEQ ID NO: 1179)
oligonucleotide 2 (5′ to 3′): AAAACGTGGCTGCTCTGGGGGCG (SEQ ID NO: 1268)
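Most rows of the oligonucleotide tables above appear to follow a regular pattern: oligonucleotide 1 reads ACACC, then the target sequence, then G, and oligonucleotide 2 reads AAAAC, then the reverse complement of the target sequence, then G. The following Python sketch illustrates that pattern only. The function names are hypothetical, the pattern is inferred from the listed rows rather than stated in the text, and a few listed rows differ from it by a base, so the tables remain the authoritative record.

```python
# Translation table mapping each DNA base to its complement (upper case only).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_oligos(target: str) -> tuple:
    """Build an (oligo 1, oligo 2) pair from a target sequence.

    Assumption (inferred from the table rows, not stated in the text):
      oligo 1 = "ACACC" + target + "G"
      oligo 2 = "AAAAC" + reverse_complement(target) + "G"
    Spaces in the input are ignored so that table-style input such as
    "G A T G ..." can be pasted directly.
    """
    cleaned = target.replace(" ", "").upper()
    invalid = set(cleaned) - set("ACGT")
    if not cleaned or invalid:
        raise ValueError("target must be a non-empty string of A, C, G, T")
    oligo1 = "ACACC" + cleaned + "G"
    oligo2 = "AAAAC" + reverse_complement(cleaned) + "G"
    return oligo1, oligo2


if __name__ == "__main__":
    # CTLA tru-gRNA target as listed in the table above.
    o1, o2 = design_oligos("G A T G T A G T G T T T C C A C A")
    print(o1)
    print(o2)
```

For the CTLA tru-gRNA and EMX1 site 2 tru-gRNA targets, this pattern reproduces the listed oligonucleotide pairs (SEQ ID NOs: 1172/1261 and 1175/1264).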

Human Cell-Based EGFP Disruption Assay

U2OS.EGFP cells harboring a single-copy, integrated EGFP-PEST gene reporter have been previously described (Reyon et al., 2012). These cells were maintained in Advanced DMEM (Life Technologies) supplemented with 10% FBS, 2 mM GlutaMax (Life Technologies), penicillin/streptomycin and 400 μg/ml G418. To assay for disruption of EGFP expression, 2×10⁵ U2OS.EGFP cells were transfected in duplicate with gRNA expression plasmid or an empty U6 promoter plasmid as a negative control, Cas9 expression plasmid (pJDS246) (Example 1 and Fu et al., 2013), and 10 ng of td-Tomato expression plasmid (to control for transfection efficiency) using a LONZA 4D-Nucleofector™, with SE solution and DN100 program according to the manufacturer's instructions. We used 25 ng/250 ng, 250 ng/750 ng, 200 ng/750 ng, and 250 ng/750 ng of gRNA expression plasmid/Cas9 expression plasmid for experiments with EGFP site #1, #2, #3, and #4, respectively. Two days following transfection, cells were trypsinized and resuspended in Dulbecco's modified Eagle medium (DMEM, Invitrogen) supplemented with 10% (vol/vol) fetal bovine serum (FBS) and analyzed on a BD LSRII flow cytometer. For each sample, transfections and flow cytometry measurements were performed in duplicate.

Transfection of Human Cells and Isolation of Genomic DNA

To assess the on-target and off-target indel mutations induced by RGNs targeted to endogenous human genes, plasmids were transfected into U2OS.EGFP or HEK293 cells using the following conditions: U2OS.EGFP cells were transfected using the same conditions as for the EGFP disruption assay described above. HEK293 cells were transfected by seeding them at a density of 1.65×10⁵ cells per well in 24-well plates in Advanced DMEM (Life Technologies) supplemented with 10% FBS and 2 mM GlutaMax (Life Technologies) at 37° C. in a CO₂ incubator. After 22-24 hours of incubation, cells were transfected with 125 ng of gRNA expression plasmid or an empty U6 promoter plasmid (as a negative control), 375 ng of Cas9 expression plasmid (pJDS246) (Example 1 and Fu et al., 2013), and 10 ng of a td-Tomato expression plasmid, using Lipofectamine LTX reagent according to the manufacturer's instructions (Life Technologies). Medium was changed 16 hours after transfection. For both types of cells, genomic DNA was harvested two days post-transfection using an Agencourt DNAdvance genomic DNA isolation kit (Beckman) according to the manufacturer's instructions. For each RGN sample to be assayed, 12 individual 4D transfection replicates were performed, genomic DNA was isolated from each of these 12 transfections, and then these samples were combined to create two “duplicate” pools each consisting of six pooled genomic DNA samples. Indel mutations were then assessed at on-target and off-target sites from these duplicate samples by T7EI assay, Sanger sequencing, and/or deep sequencing as described below.

To assess frequencies of precise alterations introduced by HDR with ssODN donor templates, 2×10⁵ U2OS.EGFP cells were transfected with 250 ng of gRNA expression plasmid or an empty U6 promoter plasmid (as a negative control), 750 ng of Cas9 expression plasmid (pJDS246), 50 pmol of ssODN donor (or no ssODN for controls), and 10 ng of td-Tomato expression plasmid (as the transfection control). Genomic DNA was purified three days after transfection using Agencourt DNAdvance and assayed for the introduction of a BamHI site at the locus of interest as described below. All of these transfections were performed in duplicate.

For experiments involving Cas9 nickases, 2×10⁵ U2OS.EGFP cells were transfected with 125 ng of each gRNA expression plasmid (if using paired gRNAs) or 250 ng of gRNA expression plasmid (if using a single gRNA), 750 ng of Cas9-D10A nickase expression plasmid (pJDS271), 10 ng of td-Tomato plasmid, and (if performing HDR) 50 pmol of ssODN donor template (encoding the BamHI site). All transfections were performed in duplicate. Genomic DNA was harvested two days after transfection (if assaying for indel mutations) or three days after transfection (if assaying for HDR/ssODN-mediated alterations) using the Agencourt DNAdvance genomic DNA isolation kit (Beckman).
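The nickase transfection conditions above depend on two choices: whether one or two gRNAs are used, and whether HDR is being assayed. The short Python sketch below tabulates the stated amounts for a given combination. The function name and dictionary keys are hypothetical and are introduced only for illustration; the amounts and time points are those given in the preceding paragraph.

```python
def nickase_transfection_mix(n_grnas: int, hdr: bool) -> dict:
    """Return the per-transfection components for a Cas9-D10A nickase experiment.

    n_grnas: 1 for a single gRNA, 2 for paired gRNAs.
    hdr:     True when assaying HDR/ssODN-mediated alterations,
             False when assaying indel mutations.
    """
    if n_grnas not in (1, 2):
        raise ValueError("n_grnas must be 1 (single gRNA) or 2 (paired gRNAs)")

    # 250 ng for a single gRNA plasmid, 125 ng of each plasmid when paired.
    per_grna_ng = 250 if n_grnas == 1 else 125

    return {
        "cells": 2e5,                        # U2OS.EGFP cells per transfection
        "grna_plasmid_ng_each": per_grna_ng,
        "grna_plasmid_ng_total": per_grna_ng * n_grnas,
        "nickase_plasmid_ng": 750,           # Cas9-D10A nickase plasmid (pJDS271)
        "td_tomato_plasmid_ng": 10,
        "ssodn_donor_pmol": 50 if hdr else 0,
        "harvest_days_post_transfection": 3 if hdr else 2,
    }


if __name__ == "__main__":
    for key, value in nickase_transfection_mix(n_grnas=2, hdr=True).items():
        print(f"{key}: {value}")
```

Under either gRNA configuration the total amount of gRNA expression plasmid is 250 ng, which the sketch makes explicit.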

T7EI Assays for Quantifying Frequencies of Indel Mutations

T7EI assays were performed as previously described (Example 1 and Fu et al., 2013). In brief, PCR reactions to amplify specific on-target or off-target sites were performed with Phusion high-fidelity DNA polymerase (New England Biolabs) using one of the two following programs: (1) Touchdown PCR program [(98° C., 10 s; 72-62° C., −1° C./cycle, 15 s; 72° C., 30 s)×10 cycles, (98° C., 10 s; 62° C., 15 s; 72° C., 30 s)×25 cycles] or (2) Constant Tm PCR program [(98° C., 10 s; 68° C. or 72° C., 15 s; 72° C., 30 s)×35 cycles], with 3% DMSO or 1 M betaine if necessary. All primers used for these amplifications are listed in Table E. Resulting PCR products ranged in size from 300 to 800 bp and were purified by Ampure XP beads (Agencourt) according to the manufacturer's instructions. 200 ng of purified PCR products were hybridized in 1×NEB buffer 2 in a total volume of 19 μl and denatured to form heteroduplexes using the following conditions: 95° C., 5 minutes; 95 to 85° C., −2° C./s; 85 to 25° C., −0.1° C./s; hold at 4° C. 1 μl of T7 Endonuclease I (New England Biolabs, 10 units/μl) was added to the hybridized PCR products and incubated at 37° C. for 15 minutes. The T7EI reaction was stopped by adding 2 μl of 0.25 M EDTA solution and the reaction products were purified using AMPure XP beads (Agencourt) with elution in 20 μl 0.1×EB buffer (QIAgen). Reaction products were then analyzed on a QIAXCEL capillary electrophoresis system and the frequencies of indel mutations were calculated using the same formula as previously described (Reyon et al., 2012).
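The touchdown PCR program and the heteroduplex annealing ramp described above can be written out step by step. The Python sketch below does so under one stated assumption: the notation "72-62° C., −1° C./cycle ×10 cycles" is read as ten cycles annealing at 72, 71, ..., 63° C., followed by 25 cycles at 62° C. The function names are hypothetical, and the sketch is an illustration of the cycling arithmetic, not a thermocycler protocol file.

```python
from typing import List, Tuple

# One step is (temperature in degrees C, hold time in seconds).
Step = Tuple[float, int]


def touchdown_program(start_anneal: int = 72,
                      n_touchdown: int = 10,
                      final_anneal: int = 62,
                      n_final: int = 25) -> List[List[Step]]:
    """Expand the touchdown PCR program into a list of cycles.

    Assumed reading: annealing drops 1 degree C per cycle from start_anneal
    for n_touchdown cycles, then stays at final_anneal for n_final cycles.
    """
    cycles = []
    for i in range(n_touchdown):
        cycles.append([(98, 10), (start_anneal - i, 15), (72, 30)])
    for _ in range(n_final):
        cycles.append([(98, 10), (final_anneal, 15), (72, 30)])
    return cycles


def ramp_seconds(start_c: float, end_c: float, rate_c_per_s: float) -> float:
    """Time needed to cool from start_c to end_c at the given rate."""
    if rate_c_per_s <= 0:
        raise ValueError("rate must be positive")
    return abs(start_c - end_c) / rate_c_per_s


if __name__ == "__main__":
    program = touchdown_program()
    anneal_temps = [cycle[1][0] for cycle in program]
    print("total cycles:", len(program))
    print("annealing temperatures:", anneal_temps)

    # Heteroduplex formation: 95 C for 5 min, then two cooling ramps.
    fast = ramp_seconds(95, 85, 2.0)   # 95 to 85 C at -2 C/s
    slow = ramp_seconds(85, 25, 0.1)   # 85 to 25 C at -0.1 C/s
    print("ramp times (s):", fast, slow)
```

Read this way, the touchdown program runs 35 cycles in total, the same number as the constant Tm program, and the slow ramp from 85 to 25° C. accounts for roughly ten minutes of the annealing step.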

TABLE E Mis- matches in non- Expected target Watson- Off-Target comparedWatson- Crick Publi- Sequenes SEQ to on- Actual Target Forward SEQReverse SEQ Crick Trans cation (Expected)- ID target in U2OS.EGFP PCR IDPCR ID PCR Trans- -ver- Tran- ID HS GRCh37 NO: site cells Primer NO:Primer NO: Conditions versions sions sitions Target 1 GGGTGGGGGGAG 1269.0 TCCAGATGGCACA 1270. AGGGAGCA 1271. DMSO TTTGCTCCTGG TTGTCAG GGAAAGTGAGGT OT1-1 GGGTGGGGGGAG 1272. 1 GGGGCCCACTCTT 1273. ACCCAGAC 1274. No 00 1 TTTGCCCCAGG CTTCCAT TCCTGGTG DMSO TGGC OT1-2 GCGTGGGGGGTG 1275. 2GCTAAGCAGAGAT 1276. ACCACCCT 1277. DMSO 2 0 0 TTTGCTCCCGG GCCTATGCCTTCCCCCA GAAA OT1-3 GGATGGAGGGAG 1278. 2 ACCCCACAGCCAG 1279. GAATCACT1280. DMSO 0 0 2 TTTGCTCCTGG GTTTTCA GCACCTGG CCATC OT1-4 GGGAGGGTGGAG1281. 2 TGCGGCAACTTCA 1282. TAAAGGGC 1283. DMSO 1 1 0 TTTGCTCCTGGGACAACC GTGCTGGG AGAG OT1-5 GGGTGGGTGGAG 1284. 2 GCATGTCAGGATC 1285.TGCAGGGC 1286. DMSO 0 2 0 TTTGCTACTGG TGACCCC CATCTTGT GTGT OT1-6CGGGGGAGGGAG 1287. 3 CCACCACATGTTC 1288. CTGGGTCT 1289. DMSO 1 1 1TTTGCTCCTGG TGGGTGC GTTCCCTG TGGG OT1-7 GAGTGGGTGGAG 1290. 3GGCTCTCCCTGCC 1291. GCAGGTCA 1292. DMSO 0 2 1 TTTGCTACAGG CTAGTTTAGTTGGAA CCCG OT1-8 GGGAGGGGAGAG 1293. 3 GGGGCTGAGAACA 1294. AGATTTGT1295. DMSO 1 0 2 TTTGTTCCAGG CATGAGATGCA GCACTGCC TGCCT OT1-9GGGAGGGGGCAG 1296. 3 CCCGACCTCCGCT 1297. GGACCTCT 1298. DMSO 2 1 0GTTGCTCCAGG CCAAAGC GCACACCC TGGC OT1-10 GGGAGGGGGGAG 1299. 3TGCAAGGTCGCAT 1300. CAGGAGGG 1301. DMSO 1 1 1 TGTGTTCCGGG AGTCCCAGGAAGTGT GTCC OT1-11 GGGGAGGGGAAG 1302. 3 GCCCATTCTTTTT 1303. GAGAGCAA1304. DMSO 0 1 2 TTTGCTCCTGG GCAGTGGA GTTTGTTC CCCAGG OT1-12GGGGGTGGGGAC 1305. 3 GCCCCCAGCCCCT 1306. GCTGCTGG 1307. DMSO 1 2 0TTTGCTCCAGG CTGTTTC TAGGGGAG CTGG OT1-13 GGGTCGGGGGAG 1308. 3CGGCTGCCTTCCC 1309. GGGTGACG 1310. 72 C. 1 2 0 TGGGCTCCAGG TGAGTCCCTTGCCAT Anneal, GAGC 3% DMSO OT1-14 GGGTGGCTGGAG 1311. 3 TGACCCTGGAGTA1312. GCTGAGAC 1313. 72 C. 2 1 0 TTTGCTGCTGG CAAAATGTTCCCA AACCAGCCAnneal, CAGCT 3% DMSO OT1-15 GGGTGGGGGGTG 1314. 
3 TGCCTCCACCCTT 1315.GCAGCCGA 1316. DMSO 1 0 2 CCTGCTCCAGG AGCCCCT TCCACACT GGGG OT1-16GGTTGAGGGGAG 1317. 3 AACTCAGGACAAC 1318. CCCAGGAG 1319. DMSO 0 1 2TCTGCTCCAGG ACTGCCTGT CAGGGTAC AATGC OT1-17 GTGTGGGTGGCG 1320. 3TCCTCCTTGGAGA 1321. CCTTGGAA 1322. DMSO 0 3 0 TTTGCTCCAGG GGGGCCCGGGGCCTT GGTGG OT1-18 AGGTGGTGGGAG 1323. 4 CCGAGGGCATGGG 1324. GGCTGCTG1325. DMSO 0 1 3 CTTGTTCCTGG CAATCCT CGAGTTGC CAAC OT1-19 AGTTTGGGGGAG1326. 4 TGCTTTGCATGGG 1327. GGGTTGCT 1328. DMSO 0 2 2 TTTGCCCCAGGGTCTCAGACA TGCCCTCT GTGT OT1-20 ATGTGTGGGGAA 1329. 4 AGCTCCTTCTCAT 1330.CACAGAAG 1331. DMSO 0 2 2 TTTGCTCCAGG TTCTCTTCTGCTG GATGTGTG T CAGGTTOT1-21 CAGTGGGGGGAG 1332. 4 AGCAGACACAGGT 1333. GGTCAGGT 1334. DMSO 1 12 CTTTCTCCTGG GAATGCTGCT GTGCTGCT AGGCA OT1-22 GAGGGGGAGCAG 1335. 4CCTGTGGGGCTCT 1336. ACTGCCTG 1337. No 1 1 2 TTTGCTCCAGG CAGGTGC CCAAAGTGDMSO GGTGT TD OT1-23 GGAGGAGGGGAG 1338. 4 AGCTGCACTGGGG 1339. TGCCGGGT1340. DMSO 0 1 3 TCTGCTCCAGG AATGAGT AATAGCTG GCTT OT1-24 GGAGGGGGGGCT1341. 4 CCAGCCTGGGCAA 1342. GGGGGCTT 1343. 72 C. 0 3 1 TTTGCTCCAGGCAAAGCG CCAGGTCA Anneal, CAGG 3% DMSO, 6% DMSO OT1-25 GGGCAAGGGGAG 1344.4 TACCCCCACTGCC 1345. ACAGGTCC 1346. DMSO 0 1 3 GTTGCTCCTGG CCATTGCATGCTTAG CAGAGGG OT1-26 GGGTGATTGAAG 1347. 4 GGGTGATTGAAGTTACGGATTCACGAC 1348. CCGAGTCC 1349. DMSO 0/1 2 2 TTTGCTCCAGGTGCTCCAGG (SEQ GGAGGTGC GTGGCAGA ID NO: 2225) GAGC GGGTGATTGAAGTTTGCTGCAGG (SEQ ID NO: 2226) OT1-27 GGGTGTGGGGTC 1350. 4 TGTGGTTGAAGTA1351. TGGCCCAA 1352. DMSO 3 1 0 ATTGCTCCAGG GGGGACAGGT TTGGAAGT GATTTCGTOT1-28 GGTGGGGGTGGG 1353. 4 TGGGATGGCAGAG 1354. GGCCCAAT 1355. DMSO 0 31 TTTGCTCCTGG TCATCAACGT CGGTAGAG GATGCA OT1-29 GTGGGGGTAGAG 1356. 4ATGGGGCGCTCCA 1357. TGCACCCA 1358. DMSO 0 3 1 TTTGCTCCAGG GTCTGTGCACAGCCA GCAA OT1-30 TAGTGGAGGGAG 1359. 4 GGGGAGGGAGGAC 1360. AATTAGCT1361. 72 C. 0 1 3 CTTGCTCCTGG CAGGGAA GGGCGCGG Anneal, TGGT 3% DMSOOT1-31 TGCTCGGGGGAG 1362. 4 ATCCCGTGCAGGA 1363. CAGGCGGC 1364. DMSO 3 10 TTTGCACCAGG AGTCGCC CCCTTGAG GAAT OT1-32 TGGAGAGGGGAG 1365. 
4CCCCAACCCTTTG 1366. TGAGGAGA 1367. DMSO 1 2 1 TTGGCTCCTGG CTCAGCGACACCACA GGCAGA OT1-33 TGGTGTTGGGAG 1368. 4 ATCGACGAGGAGG 1369. CCCCTCAC1370. DMSO 0 3 1 TCTGCTCCAGG GGGCCTT TCAAGCAG GCCC OT1-34 TTGGGGGGGCAG1371. 4 TGCTCAAGGGGCC 1372. CAGGGGCA 1373. No 1 3 0 TTTGCTCCTGG TGTTCCAGTGGCAGG DMSO AGTC OT1-35 AAGTAAGGGAAG 1374. 5 TGCCTGGCACGCA 1375.GGGAAGGG 1376. DMSO 0 0 5 TTTGCTCCTGG GTAGGTG GGAACAGG TGCA OT1-36AGAAGAGGGGAT 1377. 5 Not optimized 1 1 3 TTTGCTCCTGG OT1-37 ATCTGGGGTGAT1378. 5 ACCTGGGCTTGCC 1379. GCTGCTCG 1380. DMSO 1 3 1 TTTGCTCCTGGACTAGGG CAGTTAAG CACCA OT1-38 CTCTGCTGGGAG 1381. 5 GTGGCCGGGCTAC 1382.GGTTCCAC 1383. DMSO 3 2 0 TTTGCTCCTGG TGCTACC AAGCTGGG GGCA OT1-39CTGGTGGGGGAG 1384. 5 Not optimized 1 3 1 CTTGCTCCAGG OT1-40 CTTTCGGGGGAG1385. 5 GCAAGAGGCGGAG 1386. AGAGTCAT 1387. DMSO 2 3 0 TTTGCGCCGGGGAGACCC CCATTTCC TGGGGGC OT1-41 CTTTGGGGTTAG 1388. 5 GGGGTCAGTGGTG 1389.AGGGAATC 1390. 1M 1 4 0 TTTGCTCCTGG ATATCCCCCT CTTTTTCC betaine,ATTGCTTG TD TTT OT1-42 GCTCTGGGGTAG 1391. 5 AGAGAGGCCACGT 1392. GCCTCCCC1393. DMSO 1 3 1 TTTGCTCCAGG GGAGGGT TCCTCCTT CCCA OT1-43 GTCTCTCGGGAG1394. 5 GACAGTGCCTTGC 1395. TCTGACCG 1396. DMSO 3 2 0 TTTGCTCCGGGGATGCAC GTATGCCT GACG OT1-44 TCCTGAGGGCAG 1397. 5 TGTGTGAACGCAG 1398.TGGTCTAG 1399. DMSO 3 1 1 TTTGCTCCAGG CCTGGCT TACTTCCT CCAGCCTT OT1-45TCTTTGGGAGAG 1400. 5 GGTTCTCCCTTGG 1401. CCCACTGC 1402. DMSO 1 3 1TTTGCTCCAGG CTCCTGTGA TCCTAGCC CTGC OT1-46 ACAACTGGGGAG 1403. 6TGAAGTCAACAAT 1404. AGCTTTGG 1405. DMSO 3 1 2 TTTGCTCCTGG CTAAGCTTCCACCTAGTTGGA T GTCTTTGA AGG OT1-47 ACAAGGTGGAAG 1406. 6 TGATTGGGCTGCA 1407.GCACAGCC 1408. DMSO 2 1 3 TTTGCTCCTGG GTTCATGTACA TGCCCTTG GAAG OT1-48ACATAGAAGGAG 1409. 6 TCCATGGGCCCCT 1410. AGCGGCTT 1411. DMSO 1 0 5TTTGCTCCAGG CTGAAAGA CTGCTTCT GCGA OT1-49 AGACCCAGGGAG 1412. 6GCGGTTGGTGGGG 1413. GAGTTCCT 1414. DMSO 2 0 4 TTTGCTCCCGG TTGATGCCCTCCCGC CAGT OT1-50 AGACCCAGGGAG 1415. 6 AGGCAAGATTTTC 1416. GCTTTTGC1417. DMSO 2 0 4 TTTGCTCCCGG CAGTGTGCAAGA CTGGGACT CCGC OT1-51CACGGAGGGGTG 1418. 
6 GCTGCTGGTCGGG 1419. GCTCTGTC 1420. No 3 1 2TTTGCTCCTGG CTCTCTG CCACTTCC DMSO CCTGG TD OT1-52 CAGAGCTTGGAG 1421. 6GCTGCGAGGCTTC 1422. CGCCCCTA 1423. DMSO 3 2 1 TTTGCTCCAGG CGTGAGAGAGCTAAG GGGGT OT1-53 CTATTGATGGAG 1424. 6 CCAGGAGCCTGAG 1425. AGGGCTAG1426. DMSO 1 3 2 TTTGCTCCTGG AGCTGCC GACTGCAG TGAGC OT1-54 CTTTCTAGGGAG1427. 6 CTGTGCTCAGCCT 1428. GCCTGGGG 1429. DMSO 2 3 1 TTTGCTCCTGGGGGTGCT CTGTGAGT AGTTT OT1-55 GCCATGCTGGAG 1430. 6 AGCTCGCGCCAGA 1431.ACTTGGCA 1432. 72 C. 4 2 0 TTTGCTCCAGG TCTGTGG GGCTGAGG Anneal, CAGG 3%DMSO 1433. 1434. 1435. Target 2 GACCCCCTCCAC 1436. 0 AGAGAAGTCGAGG 1437.CAGCAGAA 1438. DMSO CCCGCCTCCGG AAGAGAGAG AGTTCATG GTTTCG OT2-1GACCCCCCCCAC 1439. 2 TGGACAGCTGCAG 1440. ACTGATCG 1441. DMSO 0 0 2CCCGCCCCCGG TACTCCCTG ATGATGGC CTATGGGT OT2-2 GGGCCCCTCCAC 1442. 2CAAGATGTGCACT 1443. GCAGCCTA 1444. DMSO 1 0 1 CCCGCCTCTGG TGGGCTATTGTCTCC TGGT OT2-3 AACCCCATCCAC 1445. 3 GTCCAGTGCCTGA 1446. AGCATCAT1447. DMSO 1 1 1 CCGGCCTCAGG CCCTGGC GCCTCCAG CTTCA OT2-4 CACCCCCTCAAC1448. 3 GCTCCCGATCCTC 1449. GCAGCTCC 1450. DMSO 1 2 0 ACCGCCTCAGGTGCCACC CACCACCC TCAG OT2-5 CACCCCCTCCCC 1451. 3 GGGGACAGGCAGG 1452.GTGCGTGT 1453. DMSO 1 1 1 TCCGCCTCAGG CAAGGAG CCGTTCAC CCCT OT2-6CTACCCCTCCAC 1454. 3 AAGGGGCTGCTGG 1455. CGTGATTC 1456. DMSO 2 1 0CCCGCCTCCGG GTAGGAC GAGTTCCT GGCA OT2-7 GACCCGCCCCGC 1457. 3GACCCTCAGGAAG 1458. CTGCGAGA 1459. 1M 1 0 2 CCCGCCTCTGG CTGGGAG TGCCCCAAbetaine, ATCG TD OT2-8 GATCGACTCCAC 1460. 3 CCGCGGCGCTCTG 1461. TGCTGGGA1462. DMSO 1 1 1 CCCGCCTCTGG CTAGA TTACAGGC GCGA OT2-9 GCCCCCACCCAC1463. 3 CCAGGTGGTGTCA 1464. TGCCTGGC 1465. DMSO 0 2 1 CCCGCCTCTGGGCGGAGG CCTCTCTG AGTCT OT2-10 GCCCCGCTCCTC 1466. 3 CGACTCCACGGCG 1467.CAGCGCAG 1468. 1M 2 1 0 CCCGCCTCCGG TCTCAGG TCCAGCCC betaine, GATG TDOT2-11 GGCCCCCTCCAC 1469. 3 CTTCCCTCCCCCA 1470. GCTACAGG 1471. DMSO 1 11 CAGGCCTCAGG GCACCAC TTGCACAG TGAGAGGT OT2-12 GGCCCCCTCCTC 1472. 3CCCCGGGGAGTCT 1473. CCCAGCCG 1474. 72 C. 1 0 2 CTCGCCTCTGG GTCCTGATTCCAGGT Anneal, CTTCC 3% DMSO OT2-13 GGCGCCCTCCAC 1475. 
3 GAAGCGCGAAAAC1476. TCCAGGGT 1477. DMSO 1 0 2 CCTGCCTCGGG CCGGCTC CCTTCTCG GCCC OT2-14GTCCTCCACCAC 1478. 3 AGGGTGGTCAGGG 1479. CATGGGGC 1480. DMSO 2 0 1CCCGCCTCTGG AGGCCTT TCGGACCT CGTC OT2-15 TACCCCCCACAC 1481. 3GGGAAGAGGCAGG 1482. TGCCAGGA 1483. 72 C. 0 2 1 CCCGCCTCTGG GCTGTCGAGGAAGCT Anneal, GGCC 3% DMSO OT2-16 AACCCATTCCAC 1484. 4 GAGTGACGATGAG1485. CCCTTAGC 1486. 68 C. 0 1 3 CCTGCCTCAGG CCCCGGG TGCAGTCG Anneal,CCCC 3% DMSO OT2-17 ACACCCCCCCAC 1487. 4 CCCATGAGGGGTT 1488. TGAAGATG1489. DMSO 0 2 2 CCCGCCTCAGG TGAGTGC GGCAGTTT GGGG OT2-18 AGCCCCCACCTC1490. 4 CACCTGGGGCATC 1491. ACTGGGGT 1492. DMSO 2 0 2 CCCGCCTCGGGTGGGTGG TGGGGAGG GGAT OT2-19 ATTCCCCCCCAC 1493. 4 TCATGATCCCCAA 1494.CCATTTGT 1495. DMSO 1 0 3 CCCGCCTCAGG AAGGGCT GCTGATCT GTGGGT OT2-20CCCCACCCCCAC 1496. 4 TGGTGCCCAGAAT 1497. AGGAAATG 1498. DMSO 1 2 1CCCGCCTCAGG AGTGGCCA TGTTGTGC CAGGGC OT2-21 CCCCCCCACCAC 1499. 4GCCTCAGACAACC 1500. GCCAAGTG 1501. No 2 1 1 CCCGCCCCGGG CTGCCCC TTACTCATDMSO CAAGAAAG TD TGG OT2-22 CCCCCCCCCCCC 1502. 4 GCCGGGACAAGAC 1503.TCCCGAAC 1504. DMSO 1 2 1 CCCGCCTCAGG TGAGTTGGG TCCCGCAA AACG OT2-23CGCCCTCCCCAC 1505. 4 TGCTGCAGGTGGT 1506. CTGGAACC 1507. No 1 0 3CCCGCCTCCGG TCCGGAG GCATCCTC DMSO CGCA TD OT2-24 CTCCCCACCCAC 1508. 4ACACTGGTCCAGG 1509. GGCTGTGC 1510. DMSO 2 1 1 CCCGCCTCAGG TCCCGTCTCTTCCGAT GGAA OT2-25 CTCTCCCCCCAC 1511. 4 CTCTCCCCCCACCC ATCGCGCCCAAAG1512. AGGCTTCT 1513. DMSO 3 0 2 CCCGCCTCTGG CCCCTCTGG (SEQ CACAGGTGGAAAAGT ID NO: 2227) CCTCAATG CA OT2-26 GCCTCTCTGCAC 1514. 4Not optimized 1 1 2 CCCGCCTCAGG OT2-27 GTCACTCCCCAC 1515. 4CCCTCATGGTGGT 1516. AGCCACAC 1517. DMSO 1 1 2 CCCGCCTCTGG CTTACGGCAATCTTTCT GGTAGGG OT2-28 TGCCCCCTCCCC 1518. 4 TGCGTCGCTCATG 1519.AGGGTGGG 1520. DMSO 0 3 1 CCAGCCTCTGG CTGGGAG GTGTACTG GCTCA OT2-29TGCCCCTCCCAC 1521. 4 GAGCTGAGACGGC 1522. TGGCCTTG 1523. 1M 0 1 3CCCGCCTCTGG ACCACTG AACTCTTG betaine, GGCT TD OT2-30 TTCCCCTTCCAC 1524.4 Not optimized 1 2 1 CCAGCCTCTGG OT2-31 TTCTCCCTCCTC 1525. 4AGTGAGAGTGGCA 1526. CAGTAGGT 1527. 
DMSO 2 1 1 CCCGCCTCGGG CGAACCAGGTCCCTT CCGC OT2-32 ACCCTCGCCCAC 1528. 5 Not optimized 1 1 3CCCGCCTCAGG OT2-33 AGCCAACCCCAC 1529. 5 GGGAGAACCTTGT 1530. AAGCCGAA1531. DMSO 0 2 3 CCCGCCTCTGG CCAGCCT AAGCTGGG CAAA OT2-34 AGGCCCCCACAC1532. 5 CTTCCCAGTGTGG 1533. ACACAGTC 1534. DMSO 1 1 3 CCCGCCTCAGGCCCGTCC AGAGCTCC GCCG OT2-35 AGGCCCCCCCGC 1535. 5 Not optimized 1 0 4CCCGCCTCAGG OT2-36 ATCTGCCACCAC 1536. 5 CTGAGAGGGGGAG 1537. TCGACTGG1538. 68 C. 3 0 2 CCCGCCTCCGG GGGGAGG TCTTGTCC Anneal, TCCCA 3% DMSOOT2-37 CATCTTCCCCAC 1539. 5 CAGCCTGCTGCAT 1540. TGCAGCCA 1541. 1M 1 0 4CCCGCCTCTGG CGGAAAA AGAGAAAA betaine, AGCCT TD OT2-38 CTTTCCCTCCAC 1542.5 TCCCTCTGACCCG 1543. ACCCGACT 1544. DMSO 2 1 2 CCAGCCTCTGG GAACCCATCCTCCCC ATTGC OT2-39 GTCGAGGTCCAC 1545. 5 TGGGGGTTGCGTG 1546. GCCAGGAG1547. DMSO 4 1 0 CCCGCCTCAGG CTTGTCA GACACCAG GACC OT2-40 GTCGAGGTCCAC1548. 5 ATCAGGTGCCAGG 1549. GGCCTGAG 1550. DMSO 4 1 0 CCCGCCTCAGGAGGACAC AGTGGAGA GTGG OT2-41 TCAGACCTCCAC 1551. 5 Not optimized 1 4 0CCCGCCTCAGG OT2-42 TGCAACCTCCTC 1552. 5 TGAGCCACATGAA 1553. ACCTCTCC1554. DMSO 1 3 1 CCCGCCTCGGG TCAAGGCCTCC AAGTCTCA GTAACTCT CT OT2-43ACCAGTCTGCAC 1555. 6 GGTCCCTCTGTGC 1556. CTTTGGTG 1557. DMSO 2 2 2CCCGCCTCTGG AGTGGAA GACCTGCA CAGC OT2-44 ACTACCCACCTC 1558. 6GCGAGGCTGCTGA 1559. GCTGGGAC 1560. DMSO 2 2 2 CCCGCCTCAGG CTTCCCTTACAGACA TGTGCCA OT2-45 ATTTCCCCCCCC 1561. 6 ATTTCCTCCCCCCCATTGCAGGCGTGT 1562. AAATCCTG 1563. DMSO 1 1 5 CCCGCCTCAGG C-CCTCAGG (SEQCCAGGCA CATGGTGA ID NO:2228) TGGGAGT OT2-46 CCACCATCCCAC 1564. 6TGCTCTGCCATTT 1565. ACAGCCTC 1566. DMSO 1 3 2 CCCGCCTCTGG ATGTCCTATGAACTTCTCCAT T GACTGAGC OT2-47 CCCAAGCCCCAC 1567. 6 TCCGCCCAAACAG 1568.GCGGTGGG 1569. DMSO 2 3 1 CCCGCCTCGGG GAGGCAG GAAGCCAT TGAG OT2-48CCGCGCTTCCGC 1570. 6 GGGGGTCTGGCTC 1571. CCTGTCGG 1572. DMSO 3 1 2CCCGCCTCTGG ACCTGGA GAGAGTGC CTGC OT2-49 CCTGCCATGCAC 1573. 6TCCTGGTTCATTT 1574. ACTCCAGA 1575. DMSO 3 2 1 CCCGCCTCAGG GCTAGAACTCTGGTGCAACCA A GGGCT OT2-50 CTGCCTCCTCAC 1576. 6 CGTGTGGTGAGCC 1577.GCTTCACC 1578. 
DMSO 3 0 3 CCCGCCTCAGG TGAGTCT GTAGAGGC TGCT OT2-51TCTTCTTTCCAC 1579. 6 AGGCCCTGATAAT 1580. TCAGTGAC 1581. DMSO 0 2 4CCCGCCTCAGG TCATGCTACCAA AACCTTTT GTATTCGG CA OT2-52 TTGACCCCCCGC 1582.6 Not optimized 2 2 2 CCCGCCTCAGG Target 3 GGTGAGTGAGTG 1583. 0TCCAGATGGCACA 1584. AGGGAGCA 1585. DMSO TGTGCGTGTGG TTGTCAG GGAAAGTGAGGT OT3-1 GGTGAGTGAGTG 1586. 1 GCAGGCAAGCTGT 1587. CACCGACA 1588. DMSO0 0 1 TGTGTGTGAGG CAAGGGT CACCCACT CACC OT3-2 AGTGAGTGAGTG 1589. 2GAGGGGGAAGTCA 1590. TACCCGGG 1591. DMSO 0 0 2 TGTGTGTGGGG  CCGACAACCGTCTGT TAGA OT3-3 AGTGTGTGAGTG 1592. 2 GACACCCCACACA 1593. TGAATCCC1594. DMSO 1 0 1 TGTGCGTGTGG CTCTCATGC TTCACCCC CAAG OT3-4 GCTGAGTGAGTG1595. 2 TCCTTTGAGGTTC 1596. CCAATCCA 1597. DMSO 1 0 1 TATGCGTGTGGATCCCCC GGATGATT CCGC OT3-5 GGTGAGTCAGTG 1598. 2 CAGGGCCAGGAAC 1599.GGGAGGTA 1600. DMSO 1 1 0 TGTGAGTGAGG ACAGGAA TGTGCGGG AGTG OT3-6GGTGAGTGAGAG 1601. 2 TGCAGCCTGAGTG 1602. GCCCAGGT 1603. DMSO 1 0 1TGTGTGTGTGG AGCAAGTGT GCTAAGCC CCTC OT3-7 GGTGAGTGAGTG 1604. 2TACAGCCTGGGTG 1605. TGTGTCAT 1606. 1M 1 1 0 AGTGAGTGAGG ATGGAGC GGACTTTCbetaine, CCATTGT TD OT3-8 GGTGAGTGAGTG 1607. 2 GGCAGGCATTAAA 1608.TCTCCCCC 1609. DMSO 1 1 0 AGTGAGTGAGG CTCATCAGGTCC AAGGTATC AGAGAGCTOT3-9 GGTGAGTGAGTG 1610. 2 GGGCCTCCCTGCT 1611. GCTGCCGT 1612. DMSO 0 1 1CGTGCGGGTGG GGTTCTC CCGAACCC AAGA OT3-10 GGTGAGTGTGTG 1613. 2ACAAACGCAGGTG 1614. ACTCCGAA 1615. DMSO 1 1 0 TGTGAGTGTGG GACCGAAAATGCCCC GCAGT OT3-11 GGTGAGTGTGTG 1616. 2 AGGGGAGGGGACA 1617. TTGAGAGG1618. DMSO 1 0 1 TGTGCATGTGG TTGCCT GTTCAGTG GTTGC OT3-12 GGTGTGTGAGTG1619. 2 CTAATGCTTACGG 1620. AGCCAACG 1621. DMSO 1 0 1 TGTGTGTGTGGCTGCGGG GCAGATGC AAAT OT3-13 GGTGTGTGTGTG 1622. 2 GAGCGAAGTTAAC 1623.CACACATG 1624. 68 C., 2 0 0 TGTGCGTGCGG CCACCGC CACATGCC 3% CCTG DMSOOT3-14 GGTGTGTGTGTG 1625. 2 GCATGTGTCTAAC 1626. TCCCCCAT 1627. DMSO 2 00 TGTGCGTGTGG TGGAGACAATAGC ATCAACAC A ACACA OT3-15 GGTGTGTGTGTG 1628. 2GCCCCTCCCGCCT 1629. TGGGCAAA 1630. DMSO 2 0 0 TGTGCGTGTGG TTTGTGTGGACATGA AACAGACA OT3-16 GGTGTGTGTGTG 1631. 
2 GCCTCAGCTCTGC 1632.ACGAACAG 1633. DMSO 2 0 0 TGTGCGTGTGG TCTTAAGCCC ATCATTTT TCATGGCT TCCOT3-17 GTTGAGTGAATG 1634. 2 CTCCAGAGCCTGG 1635. CCCTCTCC 1636. DMSO 0 11 TGTGCGTGAGG CCTACCA GGAAGTGC CTTG OT3-18 TGTGGGTGAGTG 1637. 2TCTGTCACCACAC 1638. GTTGCCTG 1639. DMSO 0 1 1 TGTGCGTGAGG AGTTACCACCGGGATGGG GTAT OT3-19 ACTGTGTGAGTG 1640. 3 GGGGACCCTCAAG 1641. GGGCATCA1642. DMSO 2 0 1 TGTGCGTGAGG AGGCACT AAGGATGG GGAT OT3-20 AGAGAGTGAGTG1643. 3 TGTGGAGGGTGGG 1644. ACAGTGAG 1645. DMSO 1 0 2 TGTGCATGAGGACCTGGT GTGCGGTC TTTGGG OT3-21 AGCGAGTGGGTG 1646. 3 CGGGGTGGCAGTG 1647.GGTGCAGT 1648. DMSO 0 0 3 TGTGCGTGGGG ACGTCAA CCAAGAGC CCCC OT3-22AGGGAGTGACTG 1649. 3 AGCTGAGGCAGAG 1650. GGGAGACA 1651. DMSO 1 1 1TGTGCGTGTGG TCCCCGA GAGCAGCG CCTC OT3-23 AGTGAGTGAGTG 1652. 3ACCACCAGACCCC 1653. AGGACGAC 1654. 72 C. 1 1 1 AGTGAGTGAGG ACCTCCATTGTGCCC Anneal, CATCA 3% DMSO OT3-24 CATGAGTGAGTG 1655. 3 GGGTCAGGACGCA1656. TCCACCCA 1657. 72 C. 2 0 1 TGTGGGTGGGG GGTCAGA CCCACCCA Anneal,TCCT 3% DMSO OT3-25 CGTGAGTGTGTG 1658. 3 ACACTCTGGGCTA 1659. GCCCCCTC1660. DMSO 2 0 1 TATGCGTGTGG GGTGCTGGA ACCACATG ATGCT OT3-26GGACTGTGAGTG 1661. 3 GGGGCCATTCCTC 1662. TGGGGATC 1663. DMSO 3 0 0TGTGCGTGAGG TGCTGCA CTTGCTCA TGGC OT3-27 GGTGTGTGCCTG 1664. 3ACACACTGGCTCG 1665. CCTGCACG 1666. DMSO 2 1 0 TGTGCGTGTGG CATTCACCAAGGCCAGG TGTT OT3-28 GTTTCATGAGTG 1667. 3 TGGGCACGTAGTA 1668. CTCGCCGC1669. DMSO 0 3 1 TGTGCGTGGGG AACTGCACCA CGTGACTG TAGG OT3-29TGAGTGTGAGTG 1670. 3 TCAGCTGGTCCTG 1671. AGAGCACT 1672. DMSO 2 1 0TGTGCGTGGGG GGCTTGG GGGTAGCA GTCAGT OT3-30 TGCCAGTGAGTG 1673. 3AGACACAGCCAGG 1674. GGTGGGCG 1675. 68 C., 1 1 1 TGTGCGTGTGG GCCTCAGTGTGTGTG 3% TACC DMSO OT3-31 TGGGTGTGAGTG 1676. 3 ACACTCTCACACA 1677.GAGAAGTC 1678. 72 C. 1 2 0 TGTGCGTGTGG CGCACCAA AGGGCTGG Anneal, CGGG 3%DMSO OT3-32 TGTATGTGAGTG 1679. 3 ACTGCCTGCATTT 1680. TGGTGAGG 1681. DMSO1 1 1 TGTGCGTGTGG CCCCGGT GCTTCAGG GAGC OT3-33 TGTGAGAGAGAG 1682. 3GCCAGGTTCATTG 1683. TCCTTCTA 1684. 
DMSO 2 1 0 TGTGCGTGTGG ACTGCCCCACATCGG CGGC OT3-34 TGTGCCTGAGTG 1685. 3 CGAGGGAGCCGAG 1686. CTGACCTG1687. DMSO 1 2 0 TGTGCGTGTGG TTCGTAA GGGCTCTG GTAC OT3-35 TGTGTGTGTGTG1688. 3 TCCTCGGGAAGTC 1689. GCACTGAG 1690. DMSO 2 1 0 TGTGCGTGTGGATGGCTTCA CAACCAGG AGCAC OT3-36 AGCGTGTGAGTG 1691. 4 Not optimized 1 0 3TATGCGTGGGG OT3-37 ATTGAGTGTGTG 1692. 4 TAAACCGTTGCCC 1693. GCTCCCCT1694. DMSO 2 1 1 AGTGCGTGGGG CCGCCTC GCCAGGTG AACC OT3-38 CATGTGTGGGTG1695. 4 CCTGCTGAGACTC 1696. CTGCGGAG 1697. DMSO 2 0 2 TGTGCGTGTGGCAGGTCC TGGCTGGC TATA OT3-39 CCCGAGTGTGTG 1698. 4 CTCGGGGACTGAC 1699.GGAGCAGC 1700. DMSO 3 0 1 TGTGCGTGTGG AAGCCGG TCTTCCAG GGCC OT3-40CTGGAGTGAGTG 1701. 4 CCCCGACCAAAGC 1702. CTGGCAGC 1703. DMSO 1 2 1TGTGTGTGTGG AGGAGCA CTCTGGAT GGGG OT3-41 GTTTCATGAGTG 1704. 4Not optimized 0 3 1 TGTGCGTGGGG OT3-42 TATGTGTGCGTG 1705. 4ATTTCAGAGCCCC 1706. AGGCCGCG 1707. DMSO 1 2 1 TGTGCGTGTGG GGGGAAAGTGTTATG GTTA OT3-43 TATGTGTGTGTG 1708. 4 GCCAGTGGCTTAG 1709. TGACATAT1710. DMSO 2 1 1 TGTGCGTGGGG TGTCTTTGTGT TTTCCTGG GCCATGGG T OT3-44TCTGTGTGTGTG 1711. 4 TGCCAGAAGAACA 1712. CCATGCTG 1713. DMSO 3 1 0TGTGCGTGGGG TGGGCCAGA ACATCATA TACTGGGA AGC OT3-45 TCTGTGTGTGTG 1714. 4GCGTGTCTCTGTG 1715. CCAGGCTG 1716. DMSO 3 1 0 TGTGCGTGTGG TGCGTGCGGCACACA GGTT OT3-46 TGAGCGTGAGTG 1717. 4 Not optimized 2 2 0TGAGCGTGTGG OT3-47 TGTCTTTGAGTG 1718. 4 TGCCCAGTCCAAT 1719. AGGATGAG1720. DMSO 2 2 0 TGTGCGTGTGG ATTTCAGCAGCT TTCATGTC CTTTGTGG GG OT3-48TTTGTGTGTGTG 1721. 4 GGGTGAAAATTTG 1722. AATGACTC 1723. DMSO 2 2 0TGTGCGTGTGG GTACTGTTAGCTG ATTCCCTG T GGTATCTC CCA OT3-49 AAGGCGTGTGTG1724. 5 TGCCCCATCAATC 1725. CAAGGTCG 1726. DMSO 1 2 2 TGTGCGTGTGGACCTCGGC GCAGGGCA GTGA OT3-50 AATTCGTGTGTG 1727. 5 GCCTCCTCTGCCG 1728.TGAGAGTT 1729. DMSO 1 2 2 TGTGCGTGGGG CTGGTAA CCTGTTGC TCCACACT OT3-51ATGGTGTGTGTG 1730. 5 Not optimized 2 2 1 TGTGCGTGTGG OT3-52 CACGTGTGTGTG1731. 5 GCCACCAAAATAG 1732. ACATGCAT 1733. DMSO 3 0 2 TGTGCGTGTGGCCAGCGT CTGTGTGT GCGT OT3-53 GAAATTTGAGTG 1734. 5 ACAGACTGACCCT 1735.TGTATCTT 1736. 
DMSO 2 1 2 TGTGCGTGTGG TGAAAAATACCAG  TCTTGCCA T ATGGTTTTCCC OT3-54 TAAGTGTGTGTG 1737. 5 AGCCAAATTTCTC 1738. TCCTGGAG 1739. DMSO3 1 1 TGTGCGTGTGG AACAGCAGCACT AGCAGGCA TTTTTGT OT3-55 TATATGTGTGTG1740. 5 ACCTCCTTGTGCT 1741. GGCGGGAA 1742. DMSO 2 1 2 TGTGCGTGGGGGGTAACCC GCCTGGC TGGG OT3-56 TATCTGTGTGTG 1743. 5 CACAAAGCTCTAC 1744.TGATCCGA 1745. DMSO 3 1 1 TGTGCGTGTGG CTTTCCAGTAGTG TGGTTGTT T CACAGCTOT3-57 TTTATGTGTGTG 1746. 5 TGTGGGGATTACC 1747. ACGCACAA 1748. DMSO 2 21 TGTGCGTGTGG TGCCTGGC AAATGCCC TTGTCA OT3-58 TTTTTGTGTGTG 1749. 5TGAGGCAGACCAG 1750. GCCCGAGC 1751. DMSO 2 3 0 TGTGCGTGGGG TCATCCAGCACAGTGTA GGGC OT3-59 AAAAATTGTGTG 1752. 6 ATTAGCTGGGCGT 1753. ACTGCATC1754. DMSO 2 1 3 TGTGCGTGGGG GGCGGAG TCATCTCA GGCAGCT OT3-60ACAATGTGTGTG 1755. 6 TGAAGCAGAAGGA 1756. TCAGCTTC 1757. DMSO 4 0 2TGTGCGTGTGG GTGGAGAAGGA ACATCTGT TTCAGTTC AGT OT3-61 ATGTGGTGTGTG 1758.6 TGGTGGAGTGTGT 1759. AGAGCAGA 1760. DMSO 1 3 2 TGTGCGTGTGG GTGTGGTAAGAGAGT GCCCA OT3-62 CAAAATTGTGTG 1761. 6 GCCCCTGTACGTC 1762. TGCACAAG1763. DMSO 3 1 2 TGTGCGTGTGG CTGACAGC CCACTTAG CCTCTCT OT3-63CCCTGGTGTGTG 1764. 6 AGCGCAGGTAAAC 1765. TCTCTCGC 1766. DMSO 3 1 2TGTGCGTGTGG AGGCCCA CCCGTTTC CTTGT OT3-64 TCCGCTTGTGTG 1767. 6ATGGGTGCCAGGT 1768. ACAGCAGG 1769. DMSO 2 3 1 TGTGCGTGGGG ACCACGCAAGGAGCC GCAG OT3-65 TCCTCGTGTGTG 1770. 6 CGGGCGGGTGGAC 1771. AGGAGGTC1772. DMSO 2 3 1 TGTGCGTGTGG AGATGAG TCGAGCCA GGGG OT3-66 TTAAGGTGGGTG1773. 6 TCAACCTAGTGAA 1774. GTCTATAT 1775. DMSO 1 2 3 TGTGCGTGGGGCACAGACCACTGA ACAGCCCA CAACCTCA TGT OT3-67 TTATATTGTGTG 1776. 6GCCAGGGCCAGTG 1777. TGTCATTT 1778. DMSO 2 4 0 TGTGCGTGGGG GATTGCTCTTAGTAT GTCAGCCG GA OT3-68 TTGAGGAGAGTG 1779. 6 GAGCCCCACCGGT 1780.GCCAGAGC 1781. DMSO 1 3 2 TGTGCGTGAGG TCAGTCC TACCCACT CGCC 1782. 1783.1784. Target 4 GAGTCCGAGCAG 1785. 0 GGAGCAGCTGGTC 1786. GGGAAGGG 1787.DMSO AAGAAGAAGGG AGAGGGG GGACACTG GGGA OT4-1 GAGTTAGAGCAG 1788. 2TCTCTCCTTCAAC 1789. ATCTGCAC 1790. DMSO 0 1 1 AAGAAGAAAGG TCATGACCAGCTATGTATGT ACAGGAGT CAT OT4-2 AAGTCAGAGGAG 1791. 
3 AAGACAGAGGAGAATGGGGAATCTCCA 1792. AGGGTGTA 1793. DMSO 2 1 1 AAGAAGAAGGG GAAGAAGGG (SEQAAGAACCCCC CTGTGGGA ID NO: 2229) ACTTTGCA OT4-3 AAGTCCGAGGAG 1794. 3GATGGCCCCACTG 1795. ACTTCGTA 1796. DMSO 1 0 2 AGGAAGAAAGG AGCACGTGAGCCTTA AACATGTG GC OT4-4 AAGTCTGAGCAC 1797. 3 AGGATTAATGTTT 1798.TCAAACAA 1799. 1M 1 0 2 AAGAAGAATGG AAAGTCACTGGTG GGTGCAGA betaine, GTACAGCA TD OT4-5 ACGTCTGAGCAG 1800. 3 TCCAAGCCACTGG 1801. TGCTCTGT 1802.DMSO 0 1 2 AAGAAGAATGG TTTCTCAGTCA GGATCATA TTTTGGGG GA OT4-6GACTCCTAGCAA 1803. 3 ACTTTCAGAGCTT 1804. CCCACGCT 1805. DMSO 1 1 1AAGAAGAATGG GGGGCAGGT GAAGTGCA ATGGC OT4-7 GAGACTGAGAAG 1806. 3CAAAGCATGCCTT 1807. GGCTCTTC 1808. 1M 1 1 1 AAGAAGAAAGG TCAGCCG GATTTGGCbetaine, ACCT TD OT4-8 GAGCCGGAGCAG 1809. 3 Not optimized 1 0 2AAGAAGGAGGG OT4-9 GAGCCTGAGCAG 1810. 3 GGACTCCCTGCAG 1811. AGGAACAC1812. 72° C. 0 0 3 AAGGAGAAGGG CTCCAGC AGGCCAGG Anneal, CTGG 6% DMSOOT4-10 GAGGCCGAGCAG 1813. 3 CCCTTTAGGCACC 1814. CCGACCTT 1815. DMSO 0 12 AAGAAAGACGG TTCCCCA CATCCCTC CTGG OT4-11 GAGTAAGAGAAG 1816. 3TGATTCTGCCTTA 1817. TGGGCTCT 1818. DMSO 0 3 0 AAGAAGAAGGG GAGTCCCAGGTGTGTCCCT ACCCA OT4-12 GAGTAGGAGGAG 1819. 3 Not optimized 2 1 0AAGAAGAAAGG OT4-13 GAGTCCGGGAAG 1820. 3 AGGCAGGAGAGCA 1821. ACCCTGAC1822. DMSO 0 1 2 GAGAAGAAAGG AGCAGGT TACTGACT GACCGCT OT4-14GATTCCTACCAG 1823. 3 CTCCCCATTGCGA 1824. AGAGGCAT 1825. DMSO 1 2 0AAGAAGAATGG CCCGAGG TGACTTGG AGCACCT OT4-15 GCGACAGAGCAG 1826. 3CTGGAGCCCAGCA 1827. CCTCAGGG 1828. DMSO 1 2 0 AAGAAGAAGGG GGAAGGCAGGGGGCC TGAT OT4-16 AAATCCAACCAG 1829. 4 ACTGTGGGCGTTG 1830. AGGTCGGT1831. DMSO 1 0 3 AAGAAGAAAGG TCCCCAC GCAGGGTT TAAGGA OT4-17 AAGTCTGAGGAC1832. 4 GGCGCTCCCTTTT 1833. CGTCACCC 1834. DMSO 2 0 2 AAGAAGAATGGTCCCTTTGT ATCGTCTC GTGGA OT4-18 AAGTTGGAGCAG 1835. 4 TGCCATCTATAGC 1836.GCATCTTG 1837. DMSO 1 0 3 GAGAAGAAGGG AGCCCCCT CTAACCGT ACTTCTTC TGAOT4-19 AATACAGAGCAG 1838. 4 GTGGAGACGCTAA 1839. GCTCCTGG 1840. DMSO 1 21 AAGAAGAATGG ACCTGTGAGGT CCTCTTCC TACAGC OT4-20 AGGTACTAGCAG 1841. 4CCGAACTTCTGCT 1842. CCAAGTCA 1843. 
DMSO 0 2 2 AAGAAGAAAGG GAGCTTGATGCATGGGCAA CAAGGGA OT4-21 AGGTGCTAGCAG 1844. 4 Not optimized 1 1 2AAGAAGAAGGG OT4-22 AGGTGGGAGCAG 1845. 4 TGCCCCCAAGACC 1846. ATGGCAGG1847. DMSO 2 0 2 AAGAAGAAGGG TTTCTCC CAGAGGAG GAAG OT4-23 CAAACGGAGCAG1848. 4 GGGTGGGGCCATT 1849. CTGGGGCC 1850. DMSO 3 0 1 AAGAAGAAAGGGTGGGTT AGGGTTTC TGCC OT4-24 CACTCTGAGGAG 1851. 4 TGGAGAACATGAG 1852.TCCTTCTG 1853. DMSO 3 0 1 AAGAAGAAAGG AGGCTTGCAA TAGGCAAT GGGAACAAOT4-25 CAGTCATGGCAG 1854. 4 GCCACATGGTAGA 1855. GGCAGATT 1856. 1M 1 2 1AAGAAGAAAGG AGTCGGC TCCCCCAT betaine, GCTG TD OT4-26 CCGTCCCAGCAG 1857.4 TGTACACCCCAAG 1858. AAGGGGAG 1859. DMSO 3 1 0 TAGAAGAATGG TCCTCCCTGTGCAAG CCTC OT4-27 GTCTGCGATCAG 1860. 4 AGGTCTGGCTAGA 1861. AGTCCAAC1862. DMSO 3 1 0 AAGAAGAAAGG GATGCAGCA ACTCAGGT GAGACCCT OT4-28TAATCCAATCAG 1863. 4 CCAAGAGGACCCA 1864. GGGTATGG 1865. DMSO 0 2 2AAGAAGAAGGG GCTGTTGGA AATTCTGG ATTAGCAG AGC OT4-29 TATACGGAGCAG 1866. 4ACCATCTCTTCAT 1867. ACACTGTG 1868. DMSO 2 2 0 AAGAAGAATGG TGATGAGTCCCAAAGTATGCT TGGCGT OT4-30 ACTTCCCTGCAG 1869. 5 GGCTGCGGGGAGA 1870. TCGGATGC1871. DMSO 2 2 1 AAGAAGAAAGG TGAGCTC TTTTCCAC AGGGCT OT4-31 AGGACTGGGCAG1872. 5 TCTTCCAGGAGGG 1873. CCAATCCT 1874. DMSO 1 0 4 AAGAAGAAGGGCAGCTCC GAGCTCCT ACAAGGCT OT4-32 AGGTTGGAGAAG 1875. 5 GAGCTGCACTGGA1876. TGCTGGTT 1877. DMSO 1 1 3 AAGAAGAAGGG TGGCACT AAGGGGTG TTTTGGAOT4-33 AGTTCAGAGCAG 1878. 5 TCTGGGAAGGTGA 1879. TGGGGGAC 1880. DMSO 0 23 GAGAAGAATGG GGAGGCCA AATGGAAA AGCAATGA OT4-34 ATGACACAGCAG 1881. 5CTTGCTCCCAGCC 1882. AGCCCTTG 1883. DMSO 3 1 1 AAGAAGAAGGG TGACCCCCCATGCAG GACC OT4-35 ATGACAGAGAAG 1884. 5 GGGATTTTTATCT 1885. AACCACAG1886. DMSO 2 2 1 AAGAAGAAAGG GTTGGGTGCGAA ATGTACCC TCAAAGCT OT4-36CCGCCCCTGCAG 1887. 5 ACCCATCAGGACC 1888. TCTGGAAC 1889. 72 C. 3 1 1AAGAAGAACGG GCAGCAC CTGGGAGG Anneal, CGGA 3% DMSO OT4-37 GCAGGAGAGCAG1890. 5 CGTCCCTCACAGC 1891. CCTCCTTG 1892. DMSO 1 3 1 AAGAAGAAAGGCAGCCTC GGCCTGGG GTTC OT4-38 GTTCAAGAGCAG 1893. 5 CCCTCTGCAAGGT 1894.AGATGTTC 1895. 
DMSO 1 3 1 AAGAAGAATGG GGAGTCTCC TGTCCCCA GGCCT OT4-39GTTTTGAAGCAG 1896. 5 GGCTTCCACTGCT 1897. TGCCGCTC 1898. DMSO 2 1 2AAGAAGAAAGG GAAGGCCT CACATACC CTCC OT4-40 TATGGCAAGCAG 1899. 5AGCATTGCCTGTC 1900. AGCACCTA 1901. DMSO 1 3 1 AAGAAGAAAGG GGGTGATGTTTGGACAC TGGTTCTC T OT4-41 TGGTGGGATCAG 1902. 5 TCTAGAGCAGGGG 1903.TGGAGATG 1904. DMSO 2 2 1 AAGAAGAAAGG CACAATGC GAGCCTGG TGGGA OT4-42ACCCACGGGCAG 1905. 6 GGTCTCAGAAAAT 1906. CCCACAGA 1907. DMSO 1 2 3AAGAAGAAGGG GGAGAGAAAGCAC AACCTGGG G CCCT OT4-43 ACTCCTGATCAG 1908. 6GGTTGCTGATACC 1909. TGGGTCCT 1910. DMSO 0 3 3 AAGAAGAAGGG AAAACGTTTGCCTCTCCACCT CTGCA OT4-44 ACTGATGAGCAG 1911. 6 ACTCTCCTTAAGT 1912. CAGAATCT1913. DMSO 0 4 2 AAGAAGAAAGG ACTGATATGGCTG TGCTCTGT T TGCCCA OT4-45ATTTTAGTGCAG 1914. 6 Not optimized 2 2 2 AAGAAGAAAGG OT4-46 ATTTTAGTGCAG1915. 6 Not optimized 2 2 2 AAGAAGAAAGG OT4-47 CCATGGCAGCAG 1916. 6CAATGCCTGCAGT 1917. TCCCAAGA 1918. DMSO 4 1 1 AAGAAGAAGGG CCTCAGGAGAAAACTC TGTCCTGA CA OT4-48 CCATTACAGCAG 1919. 6 GCATTGGCTGCCC 1920.TGGCTGTG 1921. DMSO 2 2 2 AAGAAGAAGGG AGGGAAA CTGGGCTG TGTT OT4-49CGAGGCGGGCAG 1922. 6 CCACAAGCCTCAG 1923. ACAGGTGC 1924. DMSO 2 1 3AAGAAGAAAGG CCTACCCG CAAAACAC TGCCT OT4-50 TCATTGCAGCAG 1925. 6TCATTGCAGCAGAA GCCTCTTGCAAAT 1926. CGATCAGT 1927. DMSO 2/1 2/3 2AAGAAGAAAGG GAAGAAAGG GAGACTCCTTTT CCCCTGGC TCATTGTAGCAGAA GTCCGAAGAAAGG (SEQ ID NO: 2230) OT4-51 TCTCCAGGGCAG 1928. 6 TCCCAGAATCTGC1929. AGGGGTTT 1930. DMSO 0 4 2 AAGAAGAAAGG CTCCGCA CCAGGCAC ATGGG 1931.1932. 1933. Target 5 GTCATCTTAGTC 1934. 0 TCCTAAAAATCAG 1935. AAAGTGTT1936. DMSO ATTACCTGAGG TTTTGAGATTTAC AGCCAACA TTCC TACAGAAG TCAGGA OT5-1GGTATCTAAGTC 1937. 3 GGTATCTAAGTCAT ACATCTGGGGAAA 1938. TGTCTGAG 1939.DMSO 1/2 1 1 ATTACCTGTGG TACCTGTGG (SEQ GCAAAAGTCAACA TATCTAGGID NO: 2231) CTAAAAGT GGTATCTAAGTCAA GGT TACCTGTGG (SEQ ID NO: 2232)OT5-2 GTAATATTAGTC 1940. 3 ACGATCTTGCTTC 1941. AGTGCTTT 1942. DMSO 0 3 0ATTACCGGTGG ATTTCCCTGTACA GTGAACTG AAAAGCAA ACA OT5-3 GTAATCTGAGTC 1943.3 GCACCTTGGTGCT 1944. GGGCAACT 1945. 
DMSO 1 2 0 ATTTCCTGGGG GCTAAATGCCGAACAGGC ATGAATGG OT5-4 GTCATCCTAGTC 1946. 3 AACTGTCCTGCAT 1947.GGTGCACC 1948. DMSO 1 1 1 ATTTACTGGGG CCCCGCC TGGATCCA CCCA OT5-5GTCATCCTAGTG 1949. 3 Not optimized 1950. 1951. 1 1 1 CTTACCTGAGG OT5-6GTCATCTGAGGC 1952. 3 CATCACCCTCCAC 1953. ACCACTGC 1954. 72 C. 0 3 0ATTAACTGGGG CAGGCCC TGCAGGCT Anneal, CCAG 3% DMSO OT5-7 AATATGTTAGTC1955. 4 Not optimized 2 0 2 ATTACCTGAGG OT5-8 ATAAACGTAGTC 1956. 4CCTGACCCGTGGT 1957. TGGTGCGT 1958. 72 C. 1 2 1 ATTACCTGGGG TCCCGACGGTGTGTG Anneal, TGGT 3% DMSO OT5-9 ATCATCATCGTC 1959. 4 TGGGAACATTGGA1960. CCATGTGA 1961. DMSO 1 1 2 ATTATCTGGGG GAAGTTTCCTGA CTACTGGG CTGCCCOT5-10 ATCATTTTACTC 1962. 4 AGCCTTGGCAAGC 1963. GGTTCTCT 1964. DMSO 1 03 ATTACTTGTGG AACTCCCT CTCTCAGA AAAGAAAG AGG OT5-11 ATCATTTTAGTC 1965. 4GGCAGCGGACTTC 1966. GCCAGAGG 1967. DMSO 1 0 3 ATCTCCTGTGG AGAGCCACTCTCAGC AGTGC OT5-12 CACAGCTTAGTC 1968. 4 CCAGCCTGGTCAA 1969. ACTGTGCC1970. DMSO 2 1 1 ATCACCTGGGG TATGGCA CAGCCCCA TATT OT5-13 CCCAGCTTAGTC1971. 4 ATGCCAACACTCG 1972. CGGGTTGT 1973. DMSO 2 1 1 ATTAGCTGTGGAGGGGCC GGCACCGG GTTA OT5-14 CTCACCTTTGTC 1974. 4 TTGCTCTAGTGGG 1975.AGAGTTCA 1976. DMSO 3 0 1 ATTTCCTGAGG GAGGGGG GGCATGAA AAGAAGCA ACAOT5-15 CTCATTTTATTC 1977. 4 AGCTGAAGATAGC 1978. TGCAATTT 1979. DMSO 1 12 ATTGCCTGGGG AGTGTTTAAGCCT GAGGGGCT CTCTTCA OT5-16 CTCTCCTTAGTC 1980. 4AGTCACTGGAGTA 1981. TGCCAGCC 1982. DMSO 2 0 2 ACTACCTGAGG AGCCTGCCTAAAAGTTG TTAGTGTG T OT5-17 CTTATCTCTGTC 1983. 4 GGGTCTCCCTCAG 1984.TGTGTGGT 1985. DMSO 2 0 2 ATTACCTGGGG TGCCCTG AGGGAGCA AAACGACA OT5-18GACAGCTCCGTC 1986. 4 TGGGGGCTGTTAA 1987. TGACCACA 1988. DMSO 1 2 1ATTACCTGGGG GAGGCACA CACACCCC CACG OT5-19 GCCACCTCAGTC 1989. 4TCAAAACAGATTG 1990. TGTGTTTT 1991. DMSO 1 0 3 ATTAGCTGGGG ACCAAGGCCAAATTAAGCTGC ACCCCAGG OT5-20 GGAATCTTACTC 1992. 4 TCTGGCACCAGGA 1993.GCACGCAG 1994. DMSO 1 2 1 ATTACTTGGGG CTGATTGTACA CTGACTCC CAGA OT5-21GTGGCCTCAGTC 1995. 4 Not optimized 1 0 3 ATTACCTGCGG OT5-22 GTTGTTTTAGTG1996. 4 AGCATCTGTGATA 1997. ACCAGGGC 1998. 
DMSO 1 0 3 ATTACCTGAGGCCCTACCTGTCT TGCCACAG AGTC OT5-23 TACATCTTAGTC 1999. 4 TAGTCTTGTTGCC2000. CTCGGCCC 2001. DMSO 1 2 1 CTCACCTGTGG CAGGCTG CTGAGAGT TCAT OT5-24TCCATCTCACTC 2002. 4 TCCATCTCACTCAT CTGCAACCAGGGC 2003. GAGCAGCA 2004.DMSO 1 1 2 ATTACCTGAGG TACCTGAGG (SEQ CCTTACC GCAAAGCC ID NO: 2233) ACCGTCCATCTCACTCAT TACCTGATG (SEQ ID NO: 2234) OT5-25 TTCATCCTAGTC 2005. 4GCCTGGAGAGCAA 2006. AGCCGAGA 2007. DMSO 1 1 2 AACACCTGGGG GCCTGGGCAATCTGC CCCG OT5-26 TTTATATTAGTG 2008. 4 TTTATATTAGTGAT AGTGAAACAAACA2009. GGCAGGTC 2010. No 1 2 1 ATTACCTGTGG TACCTGCGG (SEQ AGCAGCAGTCTGATGACCAGT DMSO ID NO: 2235) GGGG TD OT5-27 AACGTGTAAGTC 2011. 5AGGCTCAGAGAGG 2012. TGAGTAGA 2013. DMSO 3 0 2 ATTACCTGAGG TAAGCAATGGACAGAAATG TTACCGGT GTT OT5-28 AAGATCACAGTC 2014. 5 TCAGAGATGTTAA 2015.AGTGAACC 2016. DMSO 3 0 2 ATTACCTGGGG AGCCTTGGTGGG AAGGGAAT GGGGGAOT5-29 AGAATATTAGTC 2017. 5 TGTGCTTTCTGGG 2018. CACCTCAG 2019. DMSO 0 41 CTTACCTGGGG GTAGTGGCA CCCTGTAG TCCTGG OT5-30 AGCAGATTAGTG 2020. 5CCATTGGGTGACT 2021. GCCACTGT 2022. 1M 1 3 1 ATTACCTGGGG GAATGCACACCCCAGCC betaine, TATT TD OT5-31 AGTAGCTTAGTG 2023. 5 ACCAAGAAAGTGA2024. TGAGATGG 2025. DMSO 1 2 2 ATTACCTGGGG AAAGGAAACCC CATACGAT TTACCCAOT5-32 CACGGCTTACTC 2026. 5 AGGGTGGGGACTG 2027. TGGCATCA 2028. DMSO 3 11 ATTACCTGGGG AAAGGAGCT CTCAGAGA TTGGAACA CA OT5-33 CATATGTTAGGC 2029. 5ACCAGTGCTGTGT 2030. TCCTATGG 2031. DMSO 3 1 1 ATTACCTGGGG GACCTTGGAGAGGGGAG GCTTCT OT5-34 CATTTCTTAGTC 2032. 5 CCAGGTGTGGTGG 2033. GCATACGG2034. 68 C., 4 0 1 ATTTCCTGAGG TTCATGAC CAGTAGAA 3% TGAGCC DMSO OT5-35TGCAGCTAACTC 2035. 5 CAGGCGCTGGGTT 2036. CCTTCCTG 2037. DMSO 2 3 0ATTACCTGCGG CTTAGCCT GGCCCCAT GGTG OT5-36 TTGCTTTTAGTT 2038. 5TGGGGTCCAAGAT 2039. TGAAACTG 2040. DMSO 1 2 2 ATTACCTGGGG GTCCCCTCTTGATGA GGTGTGGA OT5-37 AACTTGAAAGTC 2041. 6 GCTGGGCTTGGTG 2042.ACTTGCAA 2043. DMSO 5 0 1 ATTACCTGTGG GTATATGC AGCTGATA ACTGACTG AOT5-38 AAGGTCACAGTC 2044. 6 AGTTGGTGTCACT 2045. CGCAGCGC 2046. DMSO 3 03 ATTACCTGGGG GACAATGGGA ACGAGTTC ATCA OT5-39 AATGTCTTCATC 2047. 
6AGAGGAGGCACAA 2048. GGCTGGGG 2049. DMSO 1 1 4 ATTACCTGAGG TTCAACCCCTAGGCCTCA CAAT OT5-40 AGATGCTTGGTC 2050. 6 GGGAAAGTTTGGG 2051. AGGACAAG2052. DMSO 1 3 2 ATTACCTGTGG AAAGTCAGCA CTACCCCA CACC OT5-41AGTAGATTAGTT 2053. 6 TGGTGCATCAAAG 2054. TCATTCCA 2055. DMSO 0 3 3ATTACCTGGGG GGTTGCTTCT GCACGCCG GGAG OT5-42 AGTAGGTTAGTA 2056. 6CCCAGGCTGCCCA 2057. TGGAGTAA 2058. DMSO 1 3 2 ATTACCTGGGG TCACACTGTATACCT TGGGGACC T OT5-43 CAAATGAGAGTC 2059. 6 TCAGTGCCCCTGG 2060.TGTGCAAA 2061. DMSO 4 2 0 ATTACCTGAGG GTCCTCA TACCTAGC ACGGTGC OT5-44CATGTCTGAATC 2062. 6 AGCACTCCCTTTT 2063. ACTGAAGT 2064. DMSO 2 1 3ATTACCTGAGG GAATTTTGGTGCT CCAGCCTC TTCCATTT CA OT5-45 CCTGACTTGGTC 2065.6 GAAACCGGTCCCT 2066. GGGGAGTA 2067. DMSO 2 0 4 ATTACCTGTGG GGTGCCAGAGGGTAG TGTTGCC OT5-46 CGTGCATTAGTC 2068. 6 TTGCGGGTCCCTG 2069.AGGTGCCG 2070. DMSO 1 2 3 ATTACCTGAGG TGGAGTC TGTTGTGC CCAA Target 6GGAATCCCTTCT 2071. 0 GCCCTACATCTGC 2072. GGGCCGGG 2073. DMSO GCAGCACCTGGTCTCCCTCCA AAAGAGTT GCTG OT6-1 GGAACCCCGTCT 2074. 2 TTGGAGTGTGGCC 2075.ACCTCTCT 2076. DMSO 0 1 1 GCAGCACCAGG CGGGTTG TTCTCTGC CTCACTGT OT6-2GGAACACCTTCT 2077. 3 CACACCATGCTGA 2078. GCAGTACG 2079. DMSO 1 1 1GCAGCTCCAGG TCCAGGC GAAGCACG AAGC OT6-3 GGAAGCTCTGCT 2080. 3CTCCAGGGCTCGC 2081. CTGGGCTC 2082. DMSO 0 2 1 GCAGCACCTGG TGTCCACTGCTGGTT CCCC OT6-4 GGAATATCTTCT 2083. 3 CTGTGGTAGCCGT 2084. CCCCATAC2085. DMSO 0 2 1 GCAGCCCCAGG GGCCAGG CACCTCTC CGGGA OT6-5 GGAATCACTTTT2086. 3 GGTGGCGGGACTT 2087. CCAGCGTG 2088. 1M 0 1 2 ACAGCACCAGG GAATGAGTTTCCAAG betaine, GGAT TD OT6-6 GGAATCCCCTCT 2089. 3 GGAATCCCCTCTCCCCAGAGGTGGGGC 2090. TTTCCACA 2091. DMSO 1 1 1/2 CCAGCCCCTGGAGCCCCTGG (SEQ CCTGTGA CTCAGTTC ID NO:2236) TGCAGGA GGAATCCCCTCTCCAGCCTCTGG (SEQ ID NO: 2237) OT6-7 GGAATCTCTTCT 2092. 3 GGAATCTCTTCCTTTGTGACTGGTTGT 2093. GCAGTGTT 2094. 1M 0 1 5 TCAGCATCTGG GGCATCTGG (SEQCCTGCTTTCCT TTGTGGTG betaine, ID NO: 2238) ATGGGCA TD OT6-8 GGAATTGCTTCT2095. 3 CTGGCCAAGGGGT 2096. TGGGACCC 2097. DMSO 1 0 2 GCAGCGCCAGGGAGTGGG CAGCAGCC AATG OT6-9 GGACTCCCCTCT 2098. 
3 ACGGTGTGCTGGC 2099.ACAGTGCT 2100. DMSO 1 1 1 GCAGCAGCTGG TGCTCTT GACCGTGC TGGG OT6-10GGAGTCCCTCCT 2101. 3 TGGTTTGGGCCTC 2102. TGCCTCCC 2103. DMSO 0 0 3ACAGCACCAGG AGGGATGG ACAAAAAT GTCTACCT OT6-11 GGAGTCCCTCCT 2104. 3TGGTTTGGGCCTC 2105. ACCCCTTA 2106. DMSO 0 0 3 ACAGCACCAGG AGGGATGGTCCCAGAA CCCATGA OT6-12 GGCATCCATTCT 2107. 3 TCCAAGTCAGCGA 2108.TGGGAGCT 2109. DMSO 0 3 0 GCAGCCCCTGG TGAGGGCT GTTCCTTT TTGGCCA OT6-13GGCTTCCCTTCT 2110. 3 CACCCCTCTCAGC 2111. GCTAGAGG 2112. DMSO 1 2 0GCAGCCCCAGG TTCCCAA GTCTGCTG CCTT OT6-14 TGAATCCCATCT 2113. 3AGACCCCTTGGCC 2114. CTTGCTCT 2115. DMSO 2 1 0 CCAGCACCAGG AAGCACACACCCCGC CTCC OT6-15 AAAATACCTTCT 2116. 4 ACATGTGGGAGGC 2117. TCTCACTT2118. DMSO 0 1 3 GCAGTACCAGG GGACAGA TGCTGTTA CCGATGTC G OT6-16AAAATCCCTTCT 2119. 4 GGACGACTGTGCC 2120. AGTGCCCA 2121. 72 C. 0 1 3TCAACACCTGG TGGGACA GAGTGTTG Anneal, TAACTGCT 3% DMSO OT6-17ACACTCCCTCCT 2122. 4 GGAGAGCTCAGCG 2123. CAGCGTGG 2124. DMSO 1 1 2GCAGCACCTGG CCAGGTC CCCGTGGG AATA OT6-18 ACCATCCCTCCT 2125. 4GCTGAAGTGCTCT 2126. ACCCCACT 2127. DMSO 1 1 2 GCAGCACCAGG GGGGTGCTGTGGATGA ATTGGTAC C OT6-19 AGAGGCCCCTCT 2128. 4 TCGGGGTGCACAT 2129.TTGCCTCG 2130. DMSO 0 1 3 GCAGCACCAGG GGCCATC CAGGGGAA GCAG OT6-20AGGATCCCTTGT 2131. 4 CTCGTGGGAGGCC 2132. AGCCACCA 2133. DMSO 2 0 2GCAGCTCCTGG AACACCT ACACATAC CAGGCT OT6-21 CCACTCCTTTCT 2134. 4GCATGCCTTTAAT 2135. AGGATTTC 2136. DMSO 2 1 1 GCAGCACCCGG CCCGGCTAGAGTGAT GGGGCT OT6-22 GAAGGCCCTTCA 2137. 4 CGCCCAGCCACAA 2138. GCAAATTT2139. DMSO 1 1 2 GCAGCACCTGG AGTGCAT CTGCACCT ACTCTAGG CCT OT6-23GATATCCCTTCT 2140. 4 AGCTCACAAGAAT 2141. GCAGTCAC 2142. DMSO 1 1 2GTATCACCTGG TGGAGGTAACAGT CCTTCACT GCCTGT OT6-24 GGGTCCGCTTCT 2143. 4AAACTGGGCTGGG 2144. GGGGCTAA 2145. DMSO 2 0 2 GCAGCACCTGG CTTCCGGGGCATTGT CAGACCC OT6-25 GTCTCCCCTTCT 2146. 4 GCAGGTAGGCAGT 2147.TCTCCTGC 2148. 1M 1 2 1 GCAGCACCAGG CTGGGGC CTCAGCCT betaine, CCCA TDOT6-26 GTCTCCCCTTCT 2149. 4 GCAGGTAGGCAGT 2150. TCTCCTGC 2151. 
1M 1 2 1GCAGCACCAGG CTGGGGC CTCAGCCT betaine, CCCA TD OT6-27 GTCTCCCCTTCT 2152.4 GCAGGTAGGCAGT 2153. TCTCCTGC 2154. 1M 1 2 1 GCAGCACCAGG CTGGGGCCTCAGCCT betaine, CCCA TD OT6-28 TCATTCCCGTCT 2155. 4 GCTCTGGGGTAGA2156. GGCCTGTC 2157. DMSO 2 2 0 GCAGCACCCGG AGGAGGC AACCAACC AACC OT6-29TGCACCCCTCCT 2158. 4 TGACATGTTGTGT 2159. AAATCCTG 2160. DMSO 0 2 2GCAGCACCAGG GCTGGGC CAGCCTCC CCTT OT6-30 TGCATACCCTCT 2161. 4TCCTGGTGAGATC 2162. TCCTCCCC 2163. DMSO 0 3 1 GCAGCACCAGG GTCCACAGGAACTCAGCC TCCC OT6-31 TGCATGGCTTCT 2164. 4 TCCTAATCCAAGT 2165. AGGGACCA2166. DMSO 2 2 0 GCAGCACCAGG CCTTTGTTCAGAC GCCACTAC A CCTTCA OT6-32AATATTCCCTCT 2167. 5 GGGACACCAGTTC 2168. GGGGGAGA 2169. DMSO 1 0 4GCAGCACCAGG CTTCCAT TTGGAGTT CCCC OT6-33 ACCATTTCTTCT 2170. 5ACACCACTATCAA 2171. TCTGCCTG 2172. DMSO 1 1 3 GCAGCACCTGG GGCAGAGTAGGTGGGTGCTT TCCC OT6-34 AGCTCCCATTCT 2173. 5 CTGGGAGCGGAGG 2174. GCCCCGAC2175. DMSO 1 2 2 GCAGCACCCGG GAAGTGC AGATGAGG CCTC OT6-35 CAGATTCCTGCT2176. 5 CAGATTACTGCTGC CGGGTCTCGGAAT 2177. ACCCAGGA 2178. DMSO 1 2 3GCAGCACCGGG AGCACCGGG (SEQ GCCTCCA ATTGCCAC ID NO: 2239) CCCC OT6-36CCAAGAGCTTCT 2179. 5 TTGCTGTGGTCCC 2180. GCAGACAC 2181. DMSO 3 2 0GCAGCACCTGG GGTGGTG TAGAGCCC GCCC OT6-37 CCCAGCCCTGCT 2182. 5GGTGTGGTGACAG 2183. ACCTGCGT 2184. DMSO 2 3 0 GCAGCACCCGG GTCGGGTCTCTGTGC TGCA OT6-38 CCCCTCCCTCCT 2185. 5 CTCCCAGGACAGT 2186. CCTGGCCC2187. DMSO 2 2 1 GCAGCACCGGG GCTCGGC CATGCTGC CTG OT6-39 CTACTGACTTCT2188. 5 TGCGTAGGTTTTG 2189. AGGGAATG 2190. DMSO 2 3 0 GCAGCACCTGGCCTCTGTGA ATGTTTTC CACCCCCT OT6-40 CTCCTCCCTCCT 2191. 5 CTCCGCAGCCACC2192. TGCATTGA 2193. DMSO 1 3 1 GCAGCACCTGG GTTGGTA CGTACGAT GGCTCAOT6-41 TCTGTCCCTCCT 2194. 5 ACCTGCAGCATGA 2195. ACCTGAGC 2196. DMSO 2 12 GCAGCACCTGG ACTCTCGCA AACATGAC TCACCTGG OT6-42 ACACAAACTTCT 2197. 6ACACAAACTTCTGC TCTCCAGTTTCTT 2198. ACCATTGG 2199. 1M 3/2 3 1 GCAGCACCTGGAGCACCTGG GCTCTCATGG TGAACCCA betaine, ACACAAACTTCTGC GTCA TDAGCACGTGG (SEQ ID NO: 2240) OT6-43 ACTGTCATTTCT 2200. 6 TGGGGTGGTGGTC2201. TCAGCTAT 2202. 
DMSO 2 1 3 GCAGCACCTGG TTGAATCCA AACCTGGG ACTTGTGCT OT6-44 ACTTTATCTTCT 2203. 6 AGCAGCCAGTCCA 2204. CCCTTTCA 2205. DMSO 31 2 GCAGCACCTGG GTGTCCTG TCGAGAAC CCCAGGG OT6-45 ATCCTTTCTTCT 2206. 6TGGACGCTGCTGG 2207. GAGGTCTC 2208. DMSO 0 3 3 GCAGCACCTGG GAGGAGAGGGCTGCT CGTG OT6-46 CACCACCGTTCT 2209. 6 AGGTTTGCACTCT 2210. TGGGGTGA2211. DMSO 3 2 1 GCAGCACCAGG GTTGCCTGG TTGGTTGC CAGGT OT6-47CATGTGGCTTCT 2212. 6 TCTTCCTTTGCCA 2213. TGCAGGAA 2214. DMSO 4 0 2GCAGCACCTGG GGCAGCACA TAGCAGGT ATGAGGAG T OT6-48 CATTTTCTTTCT 2215. 6GGACGCCTACTGC 2216. GCCCTGGC 2217. DMSO 3 0 3 GCAGCACCTGG CTGGACCAGCCCATG GTAC OT6-49 CTCTGTCCTTCT 2218. 6 AGGCAGTCATCGC 2219. GGTCCCAC2220. DMSO 2 3 1 GCAGCACCTGG CTTGCTA CTTCCCCT ACAA OT6-50 CTGTACCCTCCT2221. 6 Not optimized 3 1 2 GCAGCACCAGG OT6-51 TTGAGGCCGTCT 2222. 6CCCCAGCCCCCAC 2223. CAGCCCAG 2224. DMSO 1 4 1 GCAGCACCGGG CAGTTTCGCCACAGC TTCA

Sanger Sequencing for Quantifying Frequencies of Indel Mutations

Purified PCR products used for T7EI assay were ligated into a Zero Blunt TOPO vector (Life Technologies) and transformed into chemically competent Top 10 bacterial cells. Plasmid DNAs were isolated and sequenced by the Massachusetts General Hospital (MGH) DNA Automation Core, using an M13 forward primer (5′-GTAAAACGACGGCCAG-3′) (SEQ ID NO:1059).

Restriction Digest Assay for Quantifying Specific Alterations Induced by HDR with ssODNs

PCR reactions of specific on-target sites were performed using Phusion high-fidelity DNA polymerase (New England Biolabs). The VEGF and EMX1 loci were amplified using a touchdown PCR program ((98° C., 10 s; 72-62° C., −1° C./cycle, 15 s; 72° C., 30 s)×10 cycles, (98° C., 10 s; 62° C., 15 s; 72° C., 30 s)×25 cycles), with 3% DMSO. The primers used for these PCR reactions are listed in Table E. PCR products were purified by Ampure XP beads (Agencourt) according to the manufacturer's instructions. For detection of the BamHI restriction site encoded by the ssODN donor template, 200 ng of purified PCR products were digested with BamHI at 37° C. for 45 minutes. The digested products were purified by Ampure XP beads (Agencourt), eluted in 20 µl 0.1×EB buffer, and analyzed and quantified using a QIAXCEL capillary electrophoresis system.
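The touchdown program above can be written out cycle by cycle. The following is a minimal sketch in Python, assuming that the annealing temperature starts at 72° C. and drops by 1° C. per cycle over the 10 touchdown cycles (72° C. down to 63° C.) before the 25 plateau cycles at 62° C.; the function name and its parameters are illustrative and are not part of the described methods.

```python
def touchdown_annealing_temps(start_c=72.0, step_c=-1.0, touchdown_cycles=10,
                              plateau_c=62.0, plateau_cycles=25):
    """Return the annealing temperature (deg C) for every cycle of the program.

    Assumption: the first touchdown cycle anneals at start_c and each later
    touchdown cycle is shifted by step_c; the plateau cycles all use plateau_c.
    """
    touchdown = [start_c + step_c * i for i in range(touchdown_cycles)]
    plateau = [plateau_c] * plateau_cycles
    return touchdown + plateau


temps = touchdown_annealing_temps()
print(len(temps))   # total number of cycles
print(temps[:11])   # touchdown phase followed by the first plateau cycle
```

Under these assumptions the program runs 35 cycles in total, with denaturation (98° C., 10 s) and extension (72° C., 30 s) unchanged throughout.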

TruSeq Library Generation and Sequencing Data Analysis

Locus-specific primers were designed to flank on-target and potential and verified off-target sites to produce PCR products ~300 to 400 bp in length. Genomic DNAs from the pooled duplicate samples described above were used as templates for PCR. All PCR products were purified by Ampure XP beads (Agencourt) per the manufacturer's instructions. Purified PCR products were quantified on a QIAXCEL capillary electrophoresis system. PCR products for each locus were amplified from each of the pooled duplicate samples (described above), purified, quantified, and then pooled together in equal quantities for deep sequencing. Pooled amplicons were ligated with dual-indexed Illumina TruSeq adaptors as previously described (Fisher et al., 2011). The libraries were purified and run on a QIAXCEL capillary electrophoresis system to verify the change in size following adaptor ligation. The adapter-ligated libraries were quantified by qPCR and then sequenced using Illumina MiSeq 250 bp paired-end reads performed by the Dana-Farber Cancer Institute Molecular Biology Core Facilities. We analyzed between 75,000 and 1,270,000 (average ~422,000) reads for each sample. The TruSeq reads were analyzed for rates of indel mutagenesis as previously described (Sander et al., 2013). Specificity ratios were calculated as the ratio of observed mutagenesis at an on-target locus to that of a particular off-target locus as determined by deep sequencing. Fold-improvements in specificity with tru-RGNs for individual off-target sites were calculated as the ratio of the specificity ratio observed with tru-gRNAs to the specificity ratio for that same target with the matched full-length gRNA. As mentioned in the text, for some of the off-target sites, no indel mutations were detected with tru-gRNAs. In these cases, we used a Poisson calculator to determine, with 95% confidence, that the upper limit of the actual number of mutated sequences would be three. We then used this upper bound to estimate the minimum fold-improvement in specificity for these off-target sites.
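The calculations described above can be sketched as follows. This is a minimal illustration, not the analysis code used in the experiments: it assumes the Poisson upper limit for zero observed events is obtained by solving exp(−λ) = 1 − confidence (which gives λ ≈ 3.0 at 95% confidence, consistent with the upper limit of three stated above), and all function names and the example numbers are hypothetical.

```python
import math


def poisson_upper_limit_zero(confidence=0.95):
    """Upper limit on the Poisson mean when zero events were observed.

    Solves exp(-lam) = 1 - confidence for lam.
    """
    return -math.log(1.0 - confidence)


def specificity_ratio(on_target_freq, off_target_freq):
    """Ratio of on-target to off-target mutation frequency."""
    return on_target_freq / off_target_freq


def fold_improvement(tru_on, tru_off, std_on, std_off):
    """Specificity ratio of the tru-RGN divided by that of the standard RGN."""
    return specificity_ratio(tru_on, tru_off) / specificity_ratio(std_on, std_off)


def min_fold_improvement(tru_on, tru_off_total_reads, std_on, std_off,
                         confidence=0.95):
    """Lower bound on fold improvement when no tru-RGN indels were detected.

    The undetected off-target frequency is replaced by the rounded-up Poisson
    upper limit (3 reads at 95% confidence) divided by the total read count.
    """
    upper_count = math.ceil(poisson_upper_limit_zero(confidence))
    tru_off_upper = upper_count / tru_off_total_reads
    return fold_improvement(tru_on, tru_off_upper, std_on, std_off)


# Hypothetical frequencies (as fractions) and read count, for illustration only.
print(round(poisson_upper_limit_zero(), 3))
print(round(min_fold_improvement(tru_on=0.30, tru_off_total_reads=200_000,
                                 std_on=0.25, std_off=0.05)))
```

Because the off-target frequency enters the denominator of the specificity ratio, substituting its upper bound yields a conservative (minimum) estimate of the fold improvement.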

Example 2a. Truncated gRNAs can Efficiently Direct Cas9-Mediated Genome Editing in Human Cells

To test the hypothesis that gRNAs truncated at their 5′ end might function as efficiently as their full-length counterparts, a series of progressively shorter gRNAs were initially constructed as described above for a single target site in the EGFP reporter gene, with the following sequence: 5′-GGCGAGGGCGATGCCACCTAcGG-3′ (SEQ ID NO:2241). This particular EGFP site was chosen because it was possible to make gRNAs to it with 15, 17, 19, and 20 nts of complementarity that each have a G at their 5′ end (required for efficient expression from the U6 promoter used in these experiments). Using a human cell-based reporter assay in which the frequency of RGN-induced indels could be quantified by assessing disruption of a single integrated and constitutively expressed enhanced green fluorescent protein (EGFP) gene (Example 1 and Fu et al., 2013; Reyon et al., 2012) (FIG. 2B), the abilities of these variable-length gRNAs to direct Cas9-induced indels at the target site were measured.
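The choice of this EGFP site can be checked with a short sketch. The Python below takes the target sequence given above (SEQ ID NO:2241), removes the 3-nt PAM, truncates the 20-nt complementarity region from its 5′ end, and reports which lengths begin with a G. The helper names are illustrative only.

```python
TARGET_SITE = "GGCGAGGGCGATGCCACCTAcGG"  # 20-nt target followed by the PAM (cGG)


def split_site(site, pam_len=3):
    """Split a target site into its complementarity region and its PAM."""
    return site[:-pam_len].upper(), site[-pam_len:].upper()


def five_prime_truncations(region, lengths):
    """Keep the PAM-proximal (3') end and remove nucleotides from the 5' end."""
    return {n: region[-n:] for n in lengths}


region, pam = split_site(TARGET_SITE)
candidates = five_prime_truncations(region, range(15, 21))

# Lengths whose truncated complementarity region starts with a 5' G.
u6_compatible = sorted(n for n, seq in candidates.items() if seq.startswith("G"))
print(u6_compatible)  # [15, 17, 19, 20]
```

The lengths of 16 and 18 nts are excluded for this site because the corresponding truncations begin with A and C, respectively.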

As noted above, gRNAs bearing longer lengths of complementarity (21, 23, and 25 nts) exhibit decreased activities relative to the standard full-length gRNA containing 20 nts of complementary sequence (FIG. 2H), a result that matches those recently reported by others (Ran et al., Cell 2013). However, gRNAs bearing 17 or 19 nts of target complementarity showed activities comparable to or higher than the full-length gRNA, while a shorter gRNA bearing only 15 nts of complementarity failed to show significant activity (FIG. 2H).

To test the generality of these initial findings, full-length gRNAs and matched gRNAs bearing 18, 17 and/or 16 nts of complementarity to four additional EGFP reporter gene sites (EGFP sites #1, #2, #3, and #4; FIG. 3A) were assayed. At all four target sites, gRNAs bearing 17 and/or 18 nts of complementarity functioned as efficiently as (or, in one case, more efficiently than) their matched full-length gRNAs to induce Cas9-mediated disruption of EGFP expression (FIG. 3A). However, gRNAs with only 16 nts of complementarity showed significantly decreased or undetectable activities on the two sites for which they could be made (FIG. 3A). For each of the different sites tested, we transfected the same amounts of the full-length or shortened gRNA expression plasmid and Cas9 expression plasmid. Control experiments in which we varied the amounts of Cas9 and truncated gRNA expression plasmids transfected for EGFP sites #1, #2, and #3 suggested that shortened gRNAs function equivalently to their full-length counterparts (FIG. 3E (bottom) and 3F (bottom)) and that therefore we could use the same amounts of plasmids when making comparisons at any given target site. Taken together, these results provide evidence that shortened gRNAs bearing 17 or 18 nts of complementarity can generally function as efficiently as full-length gRNAs, and hereafter the truncated gRNAs with these complementarity lengths are referred to as “tru-gRNAs” and RGNs using these tru-gRNAs as “tru-RGNs”.

Whether tru-RGNs could efficiently induce indels on chromatinized endogenous gene targets was tested next. Tru-gRNAs were constructed for seven sites in three endogenous human genes (VEGFA, EMX1, and CLTA), including four sites that had previously been targeted with standard full-length gRNAs: VEGFA site 1, VEGFA site 3, EMX1, and CLTA (Example 1 and Fu et al., 2013; Hsu et al., 2013; Pattanayak et al., 2013) (FIG. 3B). (It was not possible to test a tru-gRNA for VEGFA site 2 from Example 1, because this target sequence does not have the G at either position 17 or 18 of the complementarity region required for gRNA expression from a U6 promoter.) Using a well-established T7 Endonuclease I (T7EI) genotyping assay (Reyon et al., 2012) as described above, the Cas9-mediated indel mutation frequencies induced by each of these various gRNAs at their respective target sites were quantified in human U2OS.EGFP cells. For five of the seven sites, tru-RGNs robustly induced indel mutations with efficiencies comparable to those mediated by matched standard RGNs (FIG. 3B). For the two sites on which tru-RGNs showed lower activities than their full-length counterparts, we note that the absolute rates of mutagenesis were still high (means of 13.3% and 16.6%), at levels that would be useful for most applications. Sanger sequencing for three of these target sites (VEGFA sites 1 and 3 and EMX1) confirmed that indels induced by tru-RGNs originate at the expected site of cleavage and that these mutations are essentially indistinguishable from those induced with standard RGNs (FIG. 3C and FIGS. 7A-D).

We also found that tru-gRNAs bearing a mismatched 5′ G and an 18 nt complementarity region could efficiently direct Cas9-induced indels, whereas those bearing a mismatched 5′ G and a 17 nt complementarity region showed lower or undetectable activities compared with matched full-length gRNAs (FIG. 7E), consistent with our findings that a minimum of 17 nts of complementarity is required for efficient RGN activity.

To further assess the genome-editing capabilities of tru-RGNs, their abilities to induce precise sequence alterations via HDR with ssODN donor templates were tested. Previous studies have shown that Cas9-induced breaks can stimulate the introduction of sequence from a homologous ssODN donor into an endogenous locus in human cells (Cong et al., 2013; Mali et al., 2013c; Ran et al., 2013; Yang et al., 2013). Therefore, the abilities of matched full-length and tru-gRNAs targeted to VEGFA site 1 and to the EMX1 site to introduce a BamHI restriction site encoded on homologous ssODNs into these endogenous genes were compared. At both sites, tru-RGNs mediated introduction of the BamHI site with efficiencies comparable to those seen with standard RGNs harboring their full-length gRNA counterparts (FIG. 3D). Taken together, these data demonstrate that tru-RGNs can function as efficiently as standard RGNs to direct both indels and precise HDR-mediated genome editing events in human cells.

Example 2b. Tru-RGNs Exhibit Enhanced Sensitivities to gRNA/DNA Interface Mismatches

Having established that tru-RGNs can function efficiently to induce on-target genome editing alterations, whether these nucleases would show greater sensitivity to mismatches at the gRNA/DNA interface was tested. To assess this, a systematic series of variants was constructed for the tru-gRNAs that were previously tested on EGFP sites #1, #2, and #3 (FIG. 3A above). The variant gRNAs harbor single Watson-Crick substitutions at each position within the complementarity region (with the exception of the 5′ G required for expression from the U6 promoter) (FIG. 5A). The human cell-based EGFP disruption assay was used to assess the relative abilities of these variant tru-gRNAs and an analogous set of matched variant full-length gRNAs made to the same three sites as described in Example 1 to direct Cas9-mediated indels. The results show that for all three EGFP target sites, tru-RGNs generally showed greater sensitivities to single mismatches than standard RGNs harboring matched full-length gRNAs (compare bottom and top panels of FIG. 5A). The magnitude of sensitivity varied by site, with the greatest differences observed for sites #2 and #3, whose tru-gRNAs harbored 17 nts of complementarity.

Encouraged by the increased sensitivity of tru-RGNs to single nucleotide mismatches, we next sought to examine the effects of systematically mismatching two adjacent positions at the gRNA-DNA interface. We therefore made variants of the tru-gRNAs targeted to EGFP target sites #1, #2, and #3, each bearing Watson-Crick transversion substitutions at two adjacent nucleotide positions (FIG. 5B). As judged by the EGFP disruption assay, the effects of adjacent double mismatches on RGN activity were again substantially greater for tru-gRNAs than for the analogous variants made in Example 1 for matched full-length gRNAs targeted to all three EGFP target sites (compare bottom to top panels in FIG. 5B). These effects appeared to be site-dependent, with nearly all of the double-mismatched tru-gRNAs for EGFP sites #2 and #3 failing to show an increase in EGFP disruption activities relative to a control gRNA lacking a complementarity region and with only three of the mismatched tru-gRNA variants for EGFP site #1 showing any residual activities (FIG. 5B). In addition, although double mutations generally showed greater effects on the 5′ end with full-length gRNAs, this effect was not observed with tru-gRNAs. Taken together, our data suggest that tru-gRNAs exhibit greater sensitivities than full-length gRNAs to single and double Watson-Crick transversion mismatches at the gRNA-DNA interface.
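The construction of the single and adjacent double mismatch variants can be sketched as below. This is an illustration under stated assumptions: a "Watson-Crick transversion" is taken to mean replacing a base with its Watson-Crick partner (A↔T, G↔C); the 5′ G is left unchanged in both variant sets (the text states this for the single variants only); and the 17-nt input is the truncated EGFP site from Example 2a, used purely as an example because the sequences of EGFP sites #1, #2, and #3 are not given in this section.

```python
WC_PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}


def single_mismatch_variants(region, keep_five_prime=True):
    """One variant per position, each with a single Watson-Crick substitution."""
    start = 1 if keep_five_prime else 0
    return [
        region[:i] + WC_PARTNER[region[i]] + region[i + 1:]
        for i in range(start, len(region))
    ]


def adjacent_double_mismatch_variants(region, keep_five_prime=True):
    """One variant per pair of adjacent positions, both substituted."""
    start = 1 if keep_five_prime else 0
    return [
        region[:i] + WC_PARTNER[region[i]] + WC_PARTNER[region[i + 1]] + region[i + 2:]
        for i in range(start, len(region) - 1)
    ]


example_region = "GAGGGCGATGCCACCTA"  # illustrative 17-nt complementarity region
singles = single_mismatch_variants(example_region)
doubles = adjacent_double_mismatch_variants(example_region)
print(len(singles), len(doubles))
print(singles[0], doubles[0])
```

For a 17-nt region with a fixed 5′ G, this yields 16 single-mismatch variants and 15 adjacent double-mismatch variants.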

Example 2c. Tru-RGNs Targeted to Endogenous Genes Show Improved Specificities in Human Cells

The next experiments were performed to determine whether tru-RGNs might show reduced genomic off-target effects in human cells relative to standard RGNs harboring full-length gRNA counterparts. We examined matched full-length and tru-gRNAs targeted to VEGFA site 1, VEGFA site 3, and EMX1 site 1 (described in FIG. 3B above) because previous studies (see Example 1 and Fu et al., 2013; Hsu et al., 2013) had defined 13 bona fide off-target sites for the full-length gRNAs targeted to these sites. (We were unable to test a tru-gRNA for VEGFA site 2 from our original study (Fu et al., 2013) because this target sequence does not have the G at either position 17 or 18 of the complementarity region required for efficient gRNA expression from a U6 promoter.) Strikingly, we found that tru-RGNs showed substantially reduced mutagenesis activity in human U2OS.EGFP cells relative to matched standard RGNs at all 13 of these bona fide off-target sites as judged by T7EI assay (Table 3A); for 11 of the 13 off-target sites, the mutation frequency with tru-RGNs dropped below the reliable detection limit of the T7EI assay (2-5%) (Table 3A). We observed similar results when these matched pairs of standard and tru-RGNs were tested at the same 13 off-target sites in another human cell line (FT-HEK293 cells) (Table 3A).

To quantify the magnitude of specificity improvement observed with tru-RGNs, we measured off-target mutation frequencies using high-throughput sequencing, which provides a more sensitive method for detecting and quantifying low frequency mutations than the T7EI assay. We assessed a subset of 12 of the 13 bona fide off-target sites for which we had seen decreased mutation rates with tru-gRNAs by T7EI assay (for technical reasons, we were unable to amplify the required shorter amplicon for one of the sites) and also examined an additional off-target site for EMX1 site 1 that had been identified by another group (Hsu et al., 2013) (FIG. 6A). For all 13 off-target sites we tested, tru-RGNs showed substantially decreased absolute frequencies of mutagenesis relative to matched standard RGNs (FIG. 6A and Table 3B) and yielded improvements in specificity of as much as 5000-fold or more relative to their standard RGN counterparts (FIG. 6B). For two off-target sites (OT1-4 and OT1-11), it was difficult to quantify the on-target to off-target ratios for tru-RGNs because the absolute number and frequency of indel mutations induced by tru-RGNs fell to background or near-background levels. Thus, the ratio of on-target to off-target rates would calculate to be infinite in these cases. To address this, we instead identified the maximum likely indel frequency with a 95% confidence level for these sites and then used this conservative estimate to calculate the minimum likely magnitude of specificity improvement for tru-RGNs relative to standard RGNs for these off-targets. These calculations suggest tru-RGNs yield improvements of ~10,000-fold or more at these sites (FIG. 6B).

To further explore the specificity of tru-RGNs, we examined their abilities to induce off-target mutations at additional closely related sites in the human genome. For the tru-gRNAs to VEGFA site 1 and EMX1, which each possess 18 nts of target site complementarity, we computationally identified all additional sites in the human genome mismatched at one or two positions within the complementarity region (not already examined above in Table 3A) and a subset of all sites mismatched at three positions, among which we favored mismatches in the 5′ end of the site as described in Example 1. For the tru-gRNA to VEGFA site 3, which possesses 17 nts of target site complementarity, we identified all sites mismatched at one position and a subset of all sites mismatched at two positions, among which mismatches in the 5′ end were favored (again not already examined in Table 3A). This computational analysis yielded a total of 30, 30, and 34 additional potential off-target sites for the tru-RGNs targeted to VEGFA site 1, VEGFA site 3, and the EMX1 site, respectively, which we then assessed for mutations using T7EI assay in human U2OS.EGFP and HEK293 cells in which the RGNs had been expressed.
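The kind of computational search described above can be sketched as a mismatch count against each candidate site that is followed by an NGG PAM. The code below is only an illustration of the idea, not the search tool used in these experiments: it scans a single strand of a toy sequence, the function names are hypothetical, and the example sequences are the VEGFA site 1 tru-gRNA target and candidate sites listed in Table 3C.

```python
ON_TARGET = "GTGGGGGGAGTTTGCTCC"  # VEGFA site 1 tru-gRNA target (18 nts), Table 3C


def count_mismatches(site, target):
    """Number of positions at which two equal-length sequences differ."""
    if len(site) != len(target):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(site.upper(), target.upper()))


def has_ngg_pam(pam):
    """True if the 3-nt PAM matches NGG."""
    return len(pam) == 3 and pam[1:].upper() == "GG"


def find_candidate_sites(sequence, target, max_mismatches):
    """Scan one strand for NGG-flanked sites within max_mismatches of target."""
    n = len(target)
    hits = []
    for i in range(len(sequence) - n - 3 + 1):
        site, pam = sequence[i:i + n], sequence[i + n:i + n + 3]
        if has_ngg_pam(pam):
            mismatches = count_mismatches(site, target)
            if mismatches <= max_mismatches:
                hits.append((i, site.upper(), pam.upper(), mismatches))
    return hits


# Candidate sites taken from Table 3C (complementarity region only).
print(count_mismatches("GTGGGGGGAGTTTGCCCC", ON_TARGET))  # 1
print(count_mismatches("GTGGAGGGAGCTTGCTCC", ON_TARGET))  # 2

# Toy sequence (hypothetical flanks) containing one of the sites plus its PAM.
toy = "AAAA" + "GTGGGGGGAGTTTGCCCC" + "aGG" + "TTTT"
print(find_candidate_sites(toy, ON_TARGET, max_mismatches=2))
```

A genome-wide search would also need to scan the reverse-complement strand and, as described above, could weight or filter candidates by where in the site the mismatches fall.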

Strikingly, the three tru-RGNs to VEGFA site 1, VEGFA site 3, and EMX1 did not induce detectable Cas9-mediated indel mutations at 93 of the 94 potential off-target sites examined in human U2OS.EGFP cells or at any of the 94 potential off-target sites in human HEK293 cells (Table 3C). For the one site at which off-target mutations were seen, whether the standard RGN with a full-length gRNA targeted to VEGFA site 1 could also mutagenize this same off-target site was examined; it induced detectable mutations, albeit at a slightly lower frequency (FIG. 6C). The lack of improvement observed with shortening of the gRNA at this off-target site can be understood by comparing the 20 and 18 nt sequences for the full-length and tru-gRNAs, which shows that the two additional bases in the full-length 20 nt target are both mismatched (FIG. 6C). In summary, based on this survey of 94 additional potential off-target sites, shortening of the gRNA does not appear to induce new high-frequency off-target mutations.

Deep sequencing of a subset of the 30 most closely matched potential off-target sites from this set of 94 sites (i.e., those with one or two mismatches) showed either undetectable or very low rates of indel mutations (Table 3D), comparable to what we observed at other previously identified off-target sites (Table 3B). We conclude that tru-RGNs generally appear to induce either very low or undetectable levels of mutations at sites that differ by one or two mismatches from the on-target site. This contrasts with standard RGNs, for which it was relatively easy to find high-frequency off-target mutations at sites that differed by as many as five mismatches (see Example 1).

TABLE 3A
On- and off-target mutation frequencies of matched tru-RGNs and standard RGNs targeted to endogenous genes in human U2OS.EGFP and HEK293 cells

Indel mutation frequencies (%) ± s.e.m. in U2OS.EGFP and HEK293 cells are listed after each target sequence.

Target ID   20mer Target                 SEQ ID NO:  U2OS.EGFP      HEK293         Truncated Target           SEQ ID NO:  U2OS.EGFP      HEK293         Gene
T1          GGGTGGGGGGAGTTTGCTCCtGG      2242.       23.69 ± 1.99   6.98 ± 1.33    GTGGGGGGAGTTTGCTCCtGG      2243.       23.93 ± 4.37   8.34 ± 0.01    VEGFA
OT1-3       GG A TGG A GGGAGTTTGCTCCtGG  2244.       17.25 ± 2.97   7.26 ± 0.62    A TGGA GGGAGTTTGCTCCtGG    2245.       N.D.           N.D.           IGDCC3
OT1-4       GGG A GGG TGGAGTTTGCTCCtGG   2246.       6.23 ± 0.20    2.66 ± 0.30    G A GGG TGGAGTTTGCTCCtGG   2247.       N.D.           N.D.           LOC116437
OT1-6       C GG G GG AGGGAGTTTGCTCCtGG  2248.       3.73 ± 0.23    1.41 ± 0.07    G G GG AGGGAGTTTGCTCCtGG   2249.       N.D.           N.D.           CACNA2D
OT1-11      GGG GA GGGG AAGTTTGCTCCtGG   2250.       10.4 ± 0.7     3.61 ± 0.02    G GA GGGG A AGTTTGCTCCtGG  2251.       N.D.           N.D.
T3          GGTGAGTGAGTGTGTGCGTGtGG      2252.       54.08 ± 1.02   22.97 ± 0.17   GAGTGAGTGTGTGCGTGtGG       2253.       50.49 ± 1.25   20.05 ± 0.01   VEGFA
OT3-1       GGTGAGTGAGTGTGTG T GTGaGG    2254.       6.16 ± 0.98    6.02 ± 0.11    GAGTGAGTGTGTG T GTGaGG     2255.       N.D.           N.D.           (abParts)
OT3-2       A GTGAGTGAGTGTGTGT GTGgGG    2256.       19.64 ± 1.06   11.29 ± 0.27   GAGTGAGTGTGTG T GTGgGG     2257.       5.52 ± 0.25    3.41 ± 0.07    MAX
OT3-4       G C TGAGTGAGTGT A TGCGTGtGG  2258.       7.95 ± 0.11    4.50 ± 0.02    GAGTGAGTGT A TGCGTGtGG     2259.       1.69 ± 0.26    1.27 ± 0.10
OT3-9       GGTGAGTGAGTG C GTGCG G GtGG  2260.       N.D.           1.09 ± 0.17    GAGTGAGTG C GTGCG G GtGG   2261.       N.D.           N.D.           TPCN2
OT3-17      G T TGAGTGA ATGTGTGCGTGaGG   2262.       1.85 ± 0.08    N.D.           GAGTGA A TGTGTGCGTGaGG     2263.       N.D.           N.D.           SLIT1
OT3-18      T GTG G GTGAGTGTGTGCGTGaGG   2264.       6.16 ± 0.56    6.27 ± 0.09    G G GTGAGTGTGTGCGTGaGG     2265.       N.D.           N.D.           COMDA
OT3-20      A G AGAGTGAGTGTGTGC A TGaGG  2266.       10.47 ± 1.08   4.38 ± 0.58    GAGTGAGTGTGTGC ATGaGG      2267.       N.D.           N.D.
T4          GAGTCCGAGCAGAAGAAGAAgGG      2268.       41.56 ± 0.20   12.65 ± 0.31   GTCCGAGCAGAAGAAGAAgGG      2269.       43.01 ± 0.87   17.25 ± 0.64   EMX1
OT4-1       GAGT TA GAGCAGAAGAAGAAaGG    2270.       19.26 ± 0.73   4.14 ± 0.66    GT TA GAGCAGAAGAAGAAaGG    2271.       N.D.           N.D.           HCN1
OT-4_Hsu31  GAGTC TA AGCAGAAGAAGAAg A G  2272.       4.37 ± 0.58    N.D.           GTC TA AGCAGAAGAAGAAg A G  2273.       N.D.           N.D.           MFAP1

Mutation frequencies were measured by T7EI assay. Means of duplicate measurements are shown with error bars representing standard errors of the mean. *Off-target site OT4 53 is the same as EMX1 target 3 OT31 from Hsu et al., 2013.
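In the matched on-target pairs of Table 3A (T1, T3, and T4), each truncated target sequence corresponds to the PAM-proximal (3′) 17 or 18 nucleotides of the 20-mer target. The following minimal sketch shows that relationship; the function name `truncate_target` is hypothetical and is used here for illustration only, and the sequences are the on-target sites of Table 3A without their PAMs.

```python
def truncate_target(target_20mer: str, length: int) -> str:
    """Return the PAM-proximal (3') `length` nucleotides of a 20-nt target.

    The 5' nucleotides are removed, so the retained sequence remains
    immediately 5' of the PAM.
    """
    if len(target_20mer) != 20:
        raise ValueError("expected a 20-nucleotide target (PAM excluded)")
    if length not in (17, 18):
        raise ValueError("length must be 17 or 18")
    return target_20mer[-length:]


# On-target sites from Table 3A (PAM removed): 20-mer target and truncated length.
T1 = truncate_target("GGGTGGGGGGAGTTTGCTCC", 18)  # VEGFA site 1
T3 = truncate_target("GGTGAGTGAGTGTGTGCGTG", 17)  # VEGFA site 3
T4 = truncate_target("GAGTCCGAGCAGAAGAAGAA", 18)  # EMX1 site 1
```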

TABLE 3B
Numbers of wild-type (WT) and indel mutation sequencing reads from deep sequencing experiments

               Control                     tru-RGN                      Standard RGN
Site           Indel   WT       Freq.      Indel    WT        Freq.     Indel    WT       Freq.
VEGFA site 1   45      140169   0.03%      122858   242127    33.66%    150652   410479   26.85%
OT1-3          0       132152   0.00%      1595     205878    0.77%     50973    144895   26.02%
OT1-4          0       133508   0.00%      0        223881    0.00%     22385    240873   8.50%
OT1-6          3       213642   0.00%      339      393124    0.09%     24332    424458   5.21%
OT1-11         1       930894   0.00%      0        274779    0.00%     43738    212212   17.09%
VEGFA site 3   5       212571   0.00%      303913   292413    50.96%    183626   174740   51.24%
OT3-2          1169    162545   0.71%      9415     277616    3.28%     26545    222482   10.66%
OT3-4          7       383006   0.00%      15551    1135673   1.35%     42699    546203   7.25%
OT3-9          73      145367   0.05%      113      227874    0.05%     1923     168293   1.13%
OT3-17         8       460498   0.00%      31       1271276   0.00%     16760    675708   2.42%
OT3-18         7       373571   0.00%      284      1275982   0.02%     72354    599030   10.78%
OT3-20         5       140848   0.00%      593      325162    0.18%     30486    202733   13.07%
EMX1 site 1    1       158838   0.00%      49104    102805    32.32%    128307   307584   29.44%
OT4-1          10      169476   0.01%      13       234039    0.01%     47426    125683   27.40%
OT4-52         2       75156    0.00%      10       231090    0.00%     429      340201   0.13%
OT4-53         0       234069   0.00%      6        367811    0.00%     17421    351667   4.72%

Freq. = frequency of indel mutations = number of indel sequences/number of wild-type sequences. Control gRNA = gRNA lacking a complementarity region
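The percentages in Table 3B can be recomputed from the read counts. The minimal sketch below assumes that the frequency is the number of indel reads divided by the total number of reads (indel plus wild-type); this assumption, rather than a direct indel/wild-type ratio, is what reproduces the tabulated values for the rows checked here. The function name `indel_frequency_percent` is hypothetical.

```python
def indel_frequency_percent(indel_reads: int, wt_reads: int) -> float:
    """Indel frequency in percent, assuming indel / (indel + wild-type) reads."""
    total = indel_reads + wt_reads
    if total == 0:
        raise ValueError("no reads")
    return 100.0 * indel_reads / total


# Read counts for VEGFA site 1, copied from Table 3B.
tru_rgn_freq = indel_frequency_percent(122858, 242127)
standard_rgn_freq = indel_frequency_percent(150652, 410479)

print(f"tru-RGN: {tru_rgn_freq:.2f}%  standard RGN: {standard_rgn_freq:.2f}%")
```

Rounded to two decimal places, these values agree with the 33.66% and 26.85% entries listed for VEGFA site 1.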

TABLE 3C
Indel mutation frequencies at potential off-target sites of tru-RGNs targeted to endogenous genes in human cells

Indel mutation frequencies (%) ± s.e.m. are listed for U2OS.EGFP cells and HEK293 cells.

Target Site + PAM      SEQ ID NO:  Number of mismatches  U2OS.EGFP cells  HEK293 cells

Target ID: VEGFA Site 1
GTGGGGGGAGTTTGCTCCtGG  2274.       0 (on-target)         23.93 ± 4.37     8.34 ± 0.01
GTGGGGGGAGTTTGCCCCaGG  2275.       1                     Not detected     Not detected
GTGGGGGGTGTTTGCTCCcGG  2276.       1                     Not detected     Not detected
GTGGGTGGAGTTTGCTACtGG  2277.       2                     Not detected     Not detected
GTGGGGGGAGCTTTCTCCtGG  2278.       2                     Not detected     Not detected
GTGGGTGGCGTTTGCTCCaGG  2279.       2                     Not detected     Not detected
GTGGAGGGAGCTTGCTCCtGG  2280.       2                     6.88 ± 0.19      Not detected
GTGGGTGGAGTTTGCTACaGG  2281.       2                     Not detected     Not detected
GGGGGGGCAGTTTGCTCCtGG  2282.       2                     Not detected     Not detected
GTGTGGGGAATTTGCTCCaGG  2283.       2                     Not detected     Not detected
CTGCTGGGAGTTTGCTCCtGG  2284.       3                     Not detected     Not detected
TTTGGGAGAGTTTGCTCCaGG  2285.       3                     Not detected     Not detected
CTGAGGGCAGTTTGCTCCaGG  2286.       3                     Not detected     Not detected
GTAAGGGAAGTTTGCTCCtGG  2287.       3                     Not detected     Not detected
GGGGGTAGAGTTTGCTCCaGG  2288.       3                     Not detected     Not detected
GGGTGGGGACTTTGCTCCaGG  2289.       3                     Not detected     Not detected
GGGGGAGCAGTTTGCTCCaGG  2290.       3                     Not detected     Not detected
TTGGGGTTAGTTTGCTCCtGG  2291.       3                     Not detected     Not detected
TTGAGGGGAGTCTGCTCCaGG  2292.       3                     Not detected     Not detected
CTGGGGTGATTTTGCTCCtGG  2293.       3                     Not detected     Not detected
GAGAGGGGAGTTGGCTCCtGG  2294.       3                     Not detected     Not detected
TTTGGGGGAGTTTGCCCCaGG  2295.       3                     Not detected     Not detected
TTCGGGGGAGTTTGCGCCgGG  2296.       3                     Not detected     Not detected
CTCGGGGGAGTTTGCACCaGG  2297.       3                     Not detected     Not detected
GTGTTGGGAGTCTGCTCCaGG  2298.       3                     Not detected     Not detected
GAGGGGGCAGGTTGCTCCaGG  2299.       3                     Not detected     Not detected
GAGGGGAGAGTTTGTTCCaGG  2300.       3                     Not detected     Not detected
GTGGCTGGAGTTTGCTGCtGG  2301.       3                     Not detected     Not detected
GTCGGGGGAGTGGGCTCCaGG  2302.       3                     Not detected     Not detected
GAGGGGGGAGTGTGTTCCgGG  2303.       3                     Not detected     Not detected
GTGGTGGGAGCTTGTTCCtGG  2304.       3                     Not detected     Not detected
GTGGGGGGTGCCTGCTCCaGG  2305.       3                     Not detected     Not detected

Target ID: VEGFA Site 3
GAGTGAGTGTGTGCGTGtGG   2306.       0 (on-target)         50.49 ± 1.25     20.05 ± 0.01
CAGTGAGTGTGTGCGTGtGG   2307.       1                     Not detected     Not detected
GTGTGAGTGTGTGCGTGgGG   2308.       1                     Not detected     Not detected
GTGTGAGTGTGTGCGTGaGG   2309.       1                     Not detected     Not detected
GTGTGAGTGTGTGCGTGtGG   2310.       1                     Not detected     Not detected
GAGTGTGTGTGTGCGTGtGG   2311.       1                     Not detected     Not detected
GAGTGGGTGTGTGCGTGgGG   2312.       1                     Not detected     Not detected
GAGTGACTGTGTGCGTGtGG   2313.       1                     Not detected     Not detected
GAGTGAGTGTGTGGGTGgGG   2314.       1                     Not detected     Not detected
GAGTGAGTGTGTGTGTGtGG   2315.       1                     Not detected     Not detected
GAGTGAGTGTGTGTGTGtGG   2316.       1                     Not detected     Not detected
GAGTGAGTGTGTGTGTGgGG   2317.       1                     Not detected     Not detected
GAGTGAGTGTGTGTGTGtGG   2318.       1                     Not detected     Not detected
GAGTGAGTGTGTGCGCGgGG   2319.       1                     Not detected     Not detected
CTGTGAGTGTGTGCGTGaGG   2320.       2                     Not detected     Not detected
ATGTGAGTGTGTGCGTGtGG   2321.       2                     Not detected     Not detected
GCCTGAGTGTGTGCGTGtGG   2322.       2                     Not detected     Not detected
GTGTGTGTGTGTGCGTGtGG   2323.       2                     Not detected     Not detected
GTGTGGGTGTGTGCGTGtGG   2324.       2                     Not detected     Not detected
GCGTGTGTGTGTGCGTGtGG   2325.       2                     Not detected     Not detected
GTGTGTGTGTGTGCGTGgGG   2326.       2                     Not detected     Not detected
GTGTGCGTGTGTGCGTGtGG   2327.       2                     Not detected     Not detected
GTGTGTGTGTGTGCGTGcGG   2328.       2                     Not detected     Not detected
GAGAGAGAGTGTGCGTGtGG   2329.       2                     Not detected     Not detected
GAGTGTGTGAGTGCGTGgGG   2330.       2                     Not detected     Not detected
GTGTGAGTGTGTGTGTGtGG   2331.       2                     Not detected     Not detected
GAGTGTGTGTATGCGTGtGG   2332.       2                     Not detected     Not detected
GAGTCAGTGTGTGAGTGaGG   2333.       2                     Not detected     Not detected
GAGTGTGTGTGTGAGTGtGG   2334.       2                     Not detected     Not detected
GAGTGTGTGTGTGCATGtGG   2335.       2                     Not detected     Not detected
GAGTGAGAGTGTGTGTGtGG   2336.       2                     Not detected     Not detected
GAGTGAGTGAGTGAGTGaGG   2337.       2                     Not detected     Not detected

Target ID: EMX1 site
GTCCGAGCAGAAGAAGAAgGG  2338.       0 (on-target)         43.01 ± 0.87     17.25 ± 0.64
GTCTGAGCAGAAGAAGAAtGG  2339.       1                     Not detected     Not detected
GTCCCAGCAGTAGAAGAAtGG  2340.       2                     Not detected     Not detected
GTCCGAGGAGAGGAAGAAaGG  2341.       2                     Not detected     Not detected
GTCAGAGGAGAAGAAGAAgGG  2342.       2                     Not detected     Not detected
GACAGAGCAGAAGAAGAAgGG  2343.       2                     Not detected     Not detected
GTGGGAGCAGAAGAAGAAgGG  2344.       2                     Not detected     Not detected
GTACTAGCAGAAGAAGAAaGG  2345.       2                     Not detected     Not detected
GTCTGAGCACAAGAAGAAtGG  2346.       2                     Not detected     Not detected
GTGCTAGCAGAAGAAGAAgGG  2347.       2                     Not detected     Not detected
TACAGAGCAGAAGAAGAAtGG  2348.       3                     Not detected     Not detected
TACGGAGCAGAAGAAGAAtGG  2349.       3                     Not detected     Not detected
AACGGAGCAGAAGAAGAAaGG  2350.       3                     Not detected     Not detected
GACACAGCAGAAGAAGAAgGG  2351.       3                     Not detected     Not detected
CTGCGATCAGAAGAAGAAaGG  2352.       3                     Not detected     Not detected
GACTGGGCAGAAGAAGAAgGG  2353.       3                     Not detected     Not detected
TTCCCTGCAGAAGAAGAAaGG  2354.       3                     Not detected     Not detected
TTCCTACCAGAAGAAGAAtGG  2355.       3                     Not detected     Not detected
CTCTGAGGAGAAGAAGAAaGG  2356.       3                     Not detected     Not detected
ATCCAATCAGAAGAAGAAgGG  2357.       3                     Not detected     Not detected
GCCCCTGCAGAAGAAGAAcGG  2358.       3                     Not detected     Not detected
ATCCAACCAGAAGAAGAAaGG  2359.       3                     Not detected     Not detected
GACTGAGAAGAAGAAGAAaGG  2360.       3                     Not detected     Not detected
GTGGGATCAGAAGAAGAAaGG  2361.       3                     Not detected     Not detected
GACAGAGAAGAAGAAGAAaGG  2362.       3                     Not detected     Not detected
GTCATGGCAGAAGAAGAAaGG  2363.       3                     Not detected     Not detected
GTTGGAGAAGAAGAAGAAgGG  2364.       3                     Not detected     Not detected
GTAAGAGAAGAAGAAGAAgGG  2365.       3                     Not detected     Not detected
CTCCTAGCAAAAGAAGAAtGG  2366.       3                     Not detected     Not detected
TTCAGAGCAGGAGAAGAAtGG  2367.       3                     Not detected     Not detected
GTTGGAGCAGGAGAAGAAgGG  2368.       3                     Not detected     Not detected
GCCTGAGCAGAAGGAGAAgGG  2369.       3                     Not detected     Not detected
GTCTGAGGACAAGAAGAAtGG  2370.       3                     Not detected     Not detected
GTCCGGGAAGGAGAAGAAaGG  2371.       3                     Not detected     Not detected
GGCCGAGCAGAAGAAAGAcGG  2372.       3                     Not detected     Not detected
GTCCTAGCAGGAGAAGAAgAG  2373.       3                     Not detected     Not detected

TABLE 3D
Frequencies of tru-RGN-induced indel mutations at potential off-target sites in human U2OS.EGFP as determined by deep sequencing

                                  tru-RGN                   Control
Off-target site sequence   S#     Indel  WT      Freq.      Indel  WT      Freq

On-target site: VEGFA site 1
GTGGGGGGAGTTTGC C CCaGG    2374.  1500   225640  0.66%      3      135451  0.00%
GTGGGGGG T GTTTGCTCCcGG    2375.  1552   152386  1.01%      0      86206   0.00%
GTGGG T GGAGTTTGCT A CtGG  2376.  1      471818  0.00%      0      199581  0.00%
GTGGG T GGAGTTTGCT A CaGG  2377.  0      337298  0.00%      1      211547  0.00%
GTGGG T GG C GTTTGCTCCaGG  2378.  2      210174  0.00%      1      105531  0.00%
GTG T GGGGA A TTTGCTCCaGG  2379.  673    715547  0.09%      1      387097  0.00%
GTGGGGGGAG C TT T CTCCtGG  2380.  5      107757  0.00%      1      58735   0.00%
G G GGGGG C AGTTTGCTCCtGG  2381.  1914   566548  0.34%      3      297083  0.00%

On-target site: VEGFA site 3
G T GTGAGTGTGTGCGTGtGG     2382.  58     324881  0.02%      9      122216  0.01%
G T GTGAGTGTGTGCGTGaGG     2383.  532    194914  0.27%      11     73644   0.01%
GAGTG G GTGTGTGCGTGgGG     2384.  70     237029  0.03%      10     178258  0.01%
GAGTGA C TGTGTGCGTGtGG     2385.  6      391894  0.00%      0      239460  0.00%
GAGTGAGTGTGTG G GTGgGG     2386.  15     160140  0.01%      10     123324  0.01%
G TGTGAGTGTGTGCGTGgGG      2387.  19     138687  0.01%      1      196271  0.00%
CAGTGAGTGTGTGCGTGtGG       2388.  78     546865  0.01%      41     355953  0.01%
G TGTGAGTGTGTGCGTGtGG      2389.  128    377451  0.03%      56     133978  0.04%
GAGTG TGTGTG T GCGTGtGG    2390.  913    263028  0.35%      78     178979  0.04%
GAGTGAGTGTG TGTGTGtGG      2391.  40     106933  0.04%      36     58812   0.06%
GAGTGAGTGTG T GTGTGtGG     2392.  681    762999  0.09%      63     222451  0.03%
GAGTGAGTGTG T GTGTGgGG     2393.  331    220289  0.15%      100    113911  0.09%
GAGTGAGTGTG T GTGTGtGG     2394.  0      35725   0.00%      8      186495  0.00%
GAGTGAGTGTGTGCG C GgGG     2395.  94     246893  0.04%      16     107623  0.01%

On-target site: EMX1 site 1
GTC A GAG G AGAAGAAGAAgGG  2396.  0      201483  0.00%      4      148416  0.00%
GTC A GAG G AGAAGAAGAAgGG  2397.  10     545662  0.00%      5      390884  0.00%
GTC T GAGCA C AAGAAGAAtGG  2398.  2      274212  0.00%      0      193837  0.00%
GTC T GAGCAGAAGAAGAAtGG    2399.  440    375646  0.12%      10     256181  0.00%
G A C A GAGCAGAAGAAGAAgGG  2400.  2      212472  0.00%      1      158860  0.00%
GT A C T AGCAGAAGAAGAAaGG  2401.  152    229209  0.07%      103    157717  0.07%
GT GG GAGCAGAAGAAGAAgGG    2402.  50     207401  0.02%      36     111183  0.03%
GTCC C AGCAG T AGAAGAAtGG  2403.  0      226477  0.00%      1      278948  0.00%

Example 2d. Tru-gRNAs can be Used with Dual Cas9 Nickases to Efficiently Induce Genome Editing in Human Cells

tru-gRNAs were tested with the recently described dual Cas9 nickase approach to induce indel mutations. To do this, the Cas9-D10A nickase together with two full-length gRNAs targeted to sites in the human VEGFA gene (VEGFA site 1 and an additional sequence we refer to as VEGFA site 4) were co-expressed in U2OS.EGFP cells (FIG. 4A). As described previously (Ran et al., 2013), this pair of nickases functioned cooperatively to induce high rates of indel mutations at the VEGFA target locus (FIG. 4B). Interestingly, Cas9-D10A nickase co-expressed with only the gRNA targeted to VEGFA site 4 also induced indel mutations at a high frequency, albeit at a rate somewhat lower than that observed with the paired full-length gRNAs (FIG. 4B). Importantly, use of a tru-gRNA for VEGFA site 1 in place of a full-length gRNA did not affect the efficacy of the dual nickase approach to induce indel mutations (FIG. 4B).

The dual nickase strategy has also been used to stimulate the introduction of specific sequence changes using ssODNs (Mali et al., 2013a; Ran et al., 2013), so we also tested whether tru-gRNAs might be used for this type of alteration. Paired full-length gRNAs for VEGFA sites 1 and 4 together with Cas9-D10A nickase cooperatively enhanced efficient introduction of a short insertion from a ssODN donor (FIG. 3A) into the VEGFA locus in human U2OS.EGFP cells, as expected (FIG. 3C). Again, the efficiency of ssODN-mediated sequence alteration by dual nicking remained equally high with the use of a tru-gRNA in place of the full-length gRNA targeted to VEGFA site 1 (FIG. 3C). Taken together, these results demonstrate that tru-gRNAs can be utilized as part of a dual Cas9 nickase strategy to induce both indel mutations and ssODN-mediated sequence changes, without compromising the efficiency of genome editing by this approach.

Having established that use of a tru-gRNA does not diminish the on-target genome editing activities of paired nickases, we next used deep sequencing to examine mutation frequencies at four previously identified bona fide off-target sites of the VEGFA site 1 gRNA. This analysis revealed that mutation rates dropped to essentially undetectable levels at all four of these off-target sites when using paired nickases with a tru-gRNA (Table 4). By contrast, neither a tru-RGN (Table 3B) nor the paired nickases with full-length gRNAs (Table 4) was able to completely eliminate off-target mutations at one of these four off-target sites (OT1-3). These results demonstrate that the use of tru-gRNAs can further reduce the off-target effects of paired Cas9 nickases (and vice versa) without compromising the efficiency of on-target genome editing.

TABLE 4
Frequencies of paired nickase-induced indel mutations at on- and off-target sites of VEGFA site 1 using full-length and tru-gRNAs

               Paired full-length gRNAs    tru-gRNA/full-length gRNA   Control
Site           Indel   WT       Freq.      Indel   WT       Freq.      Indel   WT       Freq.
VEGFA site 1   78905   345696   18.583%    65754   280720   18.978%    170     308478   0.055%
OT1-3          184     85151    0.216%     0       78658    0.000%     2       107850   0.002%
OT1-4          0       89209    0.000%     1       97010    0.001%     0       102135   0.000%
OT1-6          2       226575   0.001%     0       208218   0.000%     0       254580   0.000%
OT1-11         0       124729   0.000%     0       121581   0.000%     0       155173   0.000%
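The comparison at off-target site OT1-3 can be made explicit by recomputing each condition's indel frequency from the Table 4 read counts and checking it against the control. This is a minimal sketch only: the names `OT1_3_READS`, `indel_percent`, and `above_control` are hypothetical, the frequency is assumed to be indel reads divided by total reads, and a simple "greater than control" comparison is used for illustration rather than any statistical test.

```python
# (indel reads, wild-type reads) at off-target site OT1-3, copied from Table 4.
OT1_3_READS = {
    "paired full-length gRNAs": (184, 85151),
    "tru-gRNA/full-length gRNA": (0, 78658),
    "control": (2, 107850),
}


def indel_percent(indel: int, wt: int) -> float:
    """Indel frequency in percent, assuming indel / (indel + wild-type) reads."""
    total = indel + wt
    return 100.0 * indel / total if total else 0.0


frequencies = {name: indel_percent(*reads) for name, reads in OT1_3_READS.items()}

# Flag conditions whose frequency exceeds the control (illustrative criterion only).
above_control = {
    name: freq > frequencies["control"]
    for name, freq in frequencies.items()
    if name != "control"
}
```

With these counts, only the paired full-length gRNAs exceed the control frequency at OT1-3.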

REFERENCES

-   Cheng, A. W., Wang, H., Yang, H., Shi, L., Katz, Y., Theunissen, T. W., Rangarajan, S., Shivalila, C. S., Dadon, D. B., and Jaenisch, R. Multiplexed activation of endogenous genes by CRISPR-on, an RNA-guided transcriptional activator system. Cell Res 23, 1163-1171 (2013).
-   Cho, S. W., Kim, S., Kim, J. M. & Kim, J. S. Targeted genome engineering in human cells with the Cas9 RNA-guided endonuclease. Nat Biotechnol 31, 230-232 (2013).
-   Cong, L. et al. Multiplex genome engineering using CRISPR/Cas systems. Science 339, 819-823 (2013).
-   Cradick, T. J., Fine, E. J., Antico, C. J., and Bao, G. CRISPR/Cas9 systems targeting beta-globin and CCR5 genes have substantial off-target activity. Nucleic Acids Res (2013).
-   Dicarlo, J. E. et al. Genome engineering in Saccharomyces cerevisiae using CRISPR-Cas systems. Nucleic Acids Res (2013).
-   Ding, Q., Regan, S. N., Xia, Y., Oostrom, L. A., Cowan, C. A., and Musunuru, K. Enhanced efficiency of human pluripotent stem cell genome editing through replacing TALENs with CRISPRs. Cell Stem Cell 12, 393-394 (2013).
-   Fisher, S., Barry, A., Abreu, J., Minie, B., Nolan, J., Delorey, T. M., Young, G., Fennell, T. J., Allen, A., Ambrogio, L., et al. A scalable, fully automated process for construction of sequence-ready human exome targeted capture libraries. Genome Biol 12, R1 (2011).
-   Friedland, A. E., Tzur, Y. B., Esvelt, K. M., Colaiacovo, M. P., Church, G. M., and Calarco, J. A. Heritable genome editing in C. elegans via a CRISPR-Cas9 system. Nat Methods 10, 741-743 (2013).
-   Fu, Y., Foden, J. A., Khayter, C., Maeder, M. L., Reyon, D., Joung, J. K., and Sander, J. D. High-frequency off-target mutagenesis induced by CRISPR-Cas nucleases in human cells. Nat Biotechnol 31, 822-826 (2013).
-   Gabriel, R. et al. An unbiased genome-wide analysis of zinc-finger nuclease specificity. Nat Biotechnol 29, 816-823 (2011).
-   Gilbert, L. A., Larson, M. H., Morsut, L., Liu, Z., Brar, G. A., Torres, S. E., Stern-Ginossar, N., Brandman, O., Whitehead, E. H., Doudna, J. A., et al. CRISPR-Mediated Modular RNA-Guided Regulation of Transcription in Eukaryotes. Cell 154, 442-451 (2013).
-   Gratz, S. J. et al. Genome engineering of Drosophila with the CRISPR RNA-guided Cas9 nuclease. Genetics (2013).
-   Hockemeyer, D. et al. Genetic engineering of human pluripotent cells using TALE nucleases. Nat Biotechnol 29, 731-734 (2011).
-   Horvath, P. & Barrangou, R. CRISPR/Cas, the immune system of bacteria and archaea. Science 327, 167-170 (2010).
-   Hsu, P. D., Scott, D. A., Weinstein, J. A., Ran, F. A., Konermann, S., Agarwala, V., Li, Y., Fine, E. J., Wu, X., Shalem, O., et al. DNA targeting specificity of RNA-guided Cas9 nucleases. Nat Biotechnol 31, 827-832 (2013).
-   Hwang, W. Y. et al. Efficient genome editing in zebrafish using a CRISPR-Cas system. Nat Biotechnol 31, 227-229 (2013).
-   Hwang, W. Y., Fu, Y., Reyon, D., Maeder, M. L., Kaini, P., Sander, J. D., Joung, J. K., Peterson, R. T., and Yeh, J. R. Heritable and Precise Zebrafish Genome Editing Using a CRISPR-Cas System. PLoS One 8, e68708 (2013a).
-   Jiang, W., Bikard, D., Cox, D., Zhang, F. & Marraffini, L. A. RNA-guided editing of bacterial genomes using CRISPR-Cas systems. Nat Biotechnol 31, 233-239 (2013).
-   Jinek, M. et al. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science 337, 816-821 (2012).
-   Jinek, M. et al. RNA-programmed genome editing in human cells. Elife 2, e00471 (2013).
-   Li, D., Qiu, Z., Shao, Y., Chen, Y., Guan, Y., Liu, M., Li, Y., Gao, N., Wang, L., Lu, X., et al. Heritable gene targeting in the mouse and rat using a CRISPR-Cas system. Nat Biotechnol 31, 681-683 (2013a).
-   Li, W., Teng, F., Li, T., and Zhou, Q. Simultaneous generation and germline transmission of multiple gene mutations in rat using CRISPR-Cas systems. Nat Biotechnol 31, 684-686 (2013b).
-   Maeder, M. L., Linder, S. J., Cascio, V. M., Fu, Y., Ho, Q. H., and Joung, J. K. CRISPR RNA-guided activation of endogenous human genes. Nat Methods 10, 977-979 (2013).
-   Mali, P., Aach, J., Stranges, P. B., Esvelt, K. M., Moosburner, M., Kosuri, S., Yang, L., and Church, G. M. CAS9 transcriptional activators for target specificity screening and paired nickases for cooperative genome engineering. Nat Biotechnol 31, 833-838 (2013a).
-   Mali, P., Esvelt, K. M., and Church, G. M. Cas9 as a versatile tool for engineering biology. Nat Methods 10, 957-963 (2013b).
-   Mali, P. et al. RNA-guided human genome engineering via Cas9. Science 339, 823-826 (2013c).
-   Pattanayak, V., Lin, S., Guilinger, J. P., Ma, E., Doudna, J. A., and Liu, D. R. High-throughput profiling of off-target DNA cleavage reveals RNA-programmed Cas9 nuclease specificity. Nat Biotechnol 31, 839-843 (2013).
-   Pattanayak, V., Ramirez, C. L., Joung, J. K. & Liu, D. R. Revealing off-target cleavage specificities of zinc-finger nucleases by in vitro selection. Nat Methods 8, 765-770 (2011).
-   Perez, E. E. et al. Establishment of HIV-1 resistance in CD4+ T cells by genome editing using zinc-finger nucleases. Nat Biotechnol 26, 808-816 (2008).
-   Perez-Pinera, P., Kocak, D. D., Vockley, C. M., Adler, A. F., Kabadi, A. M., Polstein, L. R., Thakore, P. I., Glass, K. A., Ousterout, D. G., Leong, K. W., et al. RNA-guided gene activation by CRISPR-Cas9-based transcription factors. Nat Methods 10, 973-976 (2013).
-   Qi, L. S., Larson, M. H., Gilbert, L. A., Doudna, J. A., Weissman, J. S., Arkin, A. P., and Lim, W. A. Repurposing CRISPR as an RNA-guided platform for sequence-specific control of gene expression. Cell 152, 1173-1183 (2013).
-   Ran, F. A., Hsu, P. D., Lin, C. Y., Gootenberg, J. S., Konermann, S., Trevino, A. E., Scott, D. A., Inoue, A., Matoba, S., Zhang, Y., et al. Double nicking by RNA-guided CRISPR Cas9 for enhanced genome editing specificity. Cell 154, 1380-1389 (2013).
-   Reyon, D. et al. FLASH assembly of TALENs for high-throughput genome editing. Nat Biotech 30, 460-465 (2012).
-   Sander, J. D., Maeder, M. L., Reyon, D., Voytas, D. F., Joung, J. K., and Dobbs, D. ZiFiT (Zinc Finger Targeter): an updated zinc finger engineering tool. Nucleic Acids Res 38, W462-468 (2010).
-   Sander, J. D., Ramirez, C. L., Linder, S. J., Pattanayak, V., Shoresh, N., Ku, M., Foden, J. A., Reyon, D., Bernstein, B. E., Liu, D. R., et al. In silico abstraction of zinc finger nuclease cleavage profiles reveals an expanded landscape of off-target sites. Nucleic Acids Res (2013).
-   Sander, J. D., Zaback, P., Joung, J. K., Voytas, D. F., and Dobbs, D. Zinc Finger Targeter (ZiFiT): an engineered zinc finger/target site design tool. Nucleic Acids Res W599-605 (2007).
-   Shen, B. et al. Generation of gene-modified mice via Cas9/RNA-mediated gene targeting. Cell Res (2013).
-   Sugimoto, N. et al. Thermodynamic parameters to predict stability of RNA/DNA hybrid duplexes. Biochemistry 34, 11211-11216 (1995).
-   Terns, M. P. & Terns, R. M. CRISPR-based adaptive immune systems. Curr Opin Microbiol 14, 321-327 (2011).
-   Wang, H. et al. One-Step Generation of Mice Carrying Mutations in Multiple Genes by CRISPR/Cas-Mediated Genome Engineering. Cell 153, 910-918 (2013).
-   Wiedenheft, B., Sternberg, S. H. & Doudna, J. A. RNA-guided genetic silencing systems in bacteria and archaea. Nature 482, 331-338 (2012).
-   Yang, L., Guell, M., Byrne, S., Yang, J. L., De Los Angeles, A., Mali, P., Aach, J., Kim-Kiselak, C., Briggs, A. W., Rios, X., et al. Optimization of scarless human stem cell genome editing. Nucleic Acids Res 41, 9049-9061 (2013).

OTHER EMBODIMENTS

It is to be understood that while the invention has been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.

1.-28. (canceled)
 29. A complex comprising: a Streptococcus pyogenes Cas9 protein; and a Streptococcus pyogenes gRNA molecule that includes a complementarity region at the 5′ end of the gRNA consisting of 17-18 nucleotides that are complementary to 17-18 consecutive nucleotides of the complementary strand of a target sequence, wherein the target sequence is immediately 5′ of a protospacer adjacent motif, wherein the gRNA is a single gRNA, and wherein in the presence of an S. pyogenes Cas9 protein, the gRNA complementary region binds and directs the Cas9 protein to the target sequence.
 30. The complex of claim 29, wherein the Cas9 protein is a catalytically active nuclease.
 31. The complex of claim 29, wherein the Cas9 protein is a catalytically active nickase.
 32. The complex of claim 29, wherein the Cas9 protein is a catalytically inactive Cas9 protein.
 33. The complex of claim 29, wherein the complementarity region of the gRNA consists of 17 nucleotides.
 34. The complex of claim 29, wherein the complementarity region of the gRNA consists of 18 nucleotides.
 35. A DNA molecule encoding the complex of claim 29.
 36. A vector comprising the DNA molecule of claim 35.
 37. A host cell expressing the vector of claim 36.
 38. A complex comprising: a Streptococcus pyogenes Cas9 protein; and a Streptococcus pyogenes gRNA molecule that includes a complementarity region at the end of the gRNA consisting of 17-18 nucleotides that are complementary to 17-18 consecutive nucleotides of the complementary strand of a target sequence, wherein the target sequence is immediately 5′ of a protospacer adjacent motif, wherein the gRNA is a CRISPR RNA (crRNA), and wherein in the presence of an S. pyogenes Cas9 protein, the gRNA complementary region binds and directs the Cas9 protein to the target sequence.
 39. A vector comprising a DNA molecule encoding: a Streptococcus pyogenes gRNA molecule that includes a complementarity region at the end of the gRNA consisting of 17-18 nucleotides that are complementary to 17-18 consecutive nucleotides of the complementary strand of a target sequence, wherein the target sequence is immediately 5′ of a protospacer adjacent motif, wherein the gRNA is a single gRNA or a CRISPR RNA (crRNA), and wherein in the presence of an S. pyogenes Cas9 nuclease, the gRNA complementary region binds and directs the Cas9 nuclease to the target sequence; and a promoter.
 40. The vector of claim 39, wherein the promoter is a constitutive promoter.
 41. The vector of claim 39, wherein the promoter is an inducible promoter.
 42. The vector of claim 39, wherein the promoter is a weak promoter.
 43. The vector of claim 39, wherein the expression vector is a bacterial expression vector.
 44. The vector of claim 39, wherein the expression vector is a eukaryotic expression vector.
 45. The vector of claim 39, wherein the promoter is an RNA Pol III promoter.
 46. The vector of claim 39, wherein the promoter is selected from an H1, U6, or 7SK promoter.
 47. The vector of claim 39, wherein the promoter is a T7 promoter.
 48. A host cell expressing the vector of claim 47.